DESCRIPTION
RDGB-PROTEINS

Introduction 

The present invention relates generally to newly 
identified rdgB proteins and related products and methods. 

Background of the Invention 
The following discussion of the background of the
invention and references cited therein are not admitted to
be prior art to the invention.

Cellular signal transduction is a fundamental
mechanism whereby external stimuli that regulate diverse
cellular processes are relayed to the interior of cells.
One of the key biochemical mechanisms of signal
transduction involves the reversible phosphorylation of
tyrosine residues on proteins. The phosphorylation state
of a protein is modified through the reciprocal actions of
tyrosine phosphatases (TPs) and tyrosine kinases (TKs),
including receptor tyrosine kinases and non-receptor
tyrosine kinases.

A tyrosine protein kinase named PYK2 is described in
U.S. patent application Serial No. 08/460,626, filed June
2, 1995, which is a continuation-in-part application of
U.S. patent application Serial No. 08/357,642, filed
December 15, 1994, both of which are hereby incorporated
herein by reference in their entirety including any
drawings. PYK2 contains an N-terminal domain, a catalytic
domain, two proline-rich regions, potential Src homology
2 (SH2) binding regions, and a region homologous to the
focal adhesion targeting domain.

A type of protein found in Drosophila, called
Drosophila retinal degeneration B protein (rdgB), is
described in Vihtelic et al., J. of Cell Biology
122:1013-1022, 1993. The sequence described in this
reference, however, contained a sequencing error that
introduced a false stop codon, and thus the authors were
not aware that the Drosophila rdgB contains a PYK2 binding
domain. In addition, this sequence was incorrectly
identified as a member of the 6-transmembrane domain
family of proteins. These rdgB proteins function in many
sensory and neuronal cells of the fly and are directly
associated with sight in the fly.

The sequence of a genomic clone of a portion of the C.
elegans genome has been placed on a computer database,
and (although unappreciated) this sequence contains an
rdgB sequence with introns. Thus, the GenBank database
contains raw data of the nucleotide sequence of a series
of genomic clones of C. elegans. Using portions of the
human rdgB sequence, the present invention identifies an
open reading frame that has been to this point
unrecognized. An rdgB was thus found segregated into 14
exons in two separate cosmids, C54C6 (accession no.
Z77131) and M01F1 (accession no. Z46381).

Summary of the Invention 

The present invention relates to rdgB polypeptides,
nucleic acids encoding such polypeptides, cells, tissues
and animals containing such polypeptides, antibodies to
such polypeptides, assays utilizing such polypeptides, and
methods relating to all of the foregoing. Such rdgB
polypeptides are involved in various signal transduction
pathways and thus the present invention provides several
agents and methods useful for diagnosing, treating, and
preventing various diseases or conditions associated with
abnormalities in these pathways.

The present invention is based in part upon the
identification and isolation of a series of novel
non-receptor tyrosine kinase binding molecules, termed
hrdgB1, hrdgB2, and hrdgB3. The full length nucleic acid
sequences encoding these proteins are set forth
respectively in SEQ ID NO:1, SEQ ID NO:2, and SEQ ID NO:3.
The full length amino acid sequences are set forth
respectively in SEQ ID NO:4, SEQ ID NO:5, and SEQ ID NO:6.
RDGBs generally comprise three structural domains.
The N-terminal PIT domains described herein have
approximately 45% amino acid identity to human PPI1 and
PPI2. The PIT domains of RDGB2 and RDGB3 (RDGB1 lacks a
PIT domain) have approximately 72% identity with each
other and approximately 62-65% identity with the
Drosophila and C. elegans rdgBs. The full length amino
acid sequence for C. elegans is set forth in SEQ ID NO:7,
the full length Drosophila nucleic acid sequence is set
forth in SEQ ID NO:8, and the full length Drosophila amino
acid sequence is set forth in SEQ ID NO:9. The PIT
domains of the rdgBs have a conserved putative ATP binding
motif similar to that seen in protein kinases.

The second, central domain is present in all human
rdgBs described herein and has no sequence homology to any
other known domain. The three human rdgBs share 43-47%
identity over the 600 to 675 amino acid stretch and show
25-35% identity to the invertebrate rdgBs. This large
domain contains three subdomains with much higher identity
(66-88% in the human rdgBs and 35-75% with the
invertebrate rdgBs). This high level of conservation,
especially across such a diverse set of species, suggests
an important functional role for these stretches. The
N-terminal portion of the central domain is a conserved
acidic region of 10 to 15 amino acids comprised almost
exclusively of glutamate and aspartate residues that may
function as a calcium binding motif.
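The percent identity figures quoted for the rdgB domains can be illustrated with a minimal sketch. The text does not state how identity was calculated, so the convention used here (identical residues divided by the number of aligned columns in which neither sequence has a gap) is an assumption, and the function name `percent_identity` and the toy fragments are hypothetical and are not taken from any SEQ ID NO.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: the two inputs are already aligned (equal length, gaps as '-').
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # gapped columns are excluded from the denominator
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical pre-aligned toy fragments, for illustration only.
toy_a = "MLIKEYRI-PLPV"
toy_b = "MLLKEYRICPMPV"
print(round(percent_identity(toy_a, toy_b), 1))
```

Other denominator conventions (for example, the length of the shorter sequence) give somewhat different values, so identity figures are only comparable when the same convention is used throughout.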

The third rdgB domain is unique to these
proteins and consists of the C-terminal 343 to 384
residues of the proteins. There is 60-63% identity
amongst the human rdgBs and 40-60% with the invertebrate
rdgBs. The comparison with the Drosophila rdgB is based
on the unique knowledge of this domain and its functional
significance as described herein. The published sequence
contained a frameshift mutation such that the protein was
previously thought to terminate less than halfway through
this domain. By comparison with the human sequences, the
present invention provides a sequence that extends beyond
the end of the Drosophila sequence to include amino acids
1054-1249.

Within the PYK2 binding domain is a distinct motif
with primary sequence homology to the nucleotide binding
region of the ras-related GTP-binding proteins. All
members of this family (ras, rho, rac, rab, ran) contain
a sequence characterized by the conserved
hydrophobic-hydrophobic-G-X-K-X-D-hydrophobic amino acid
sequence. The G-X-K motif in the rdgBs is at aa 614
(rdgB1), aa 898 (rdgB2), aa 983 (rdgB3) and aa 987 (dm).
Based on analysis of the three dimensional structure (by
X-ray crystallography) of this region from ras and ran,
this motif grasps the nucleotide ring of GDP/GTP as part
of the molecular "on-off" switch in these proteins. The
rdgBs, however, lack the upstream P-loop or A-box present
in these small G-proteins.
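The hydrophobic-hydrophobic-G-X-K-X-D-hydrophobic pattern described above lends itself to a simple sequence scan. The following is a minimal sketch only: the set of residues treated as hydrophobic (`AVLIMFWY`), the function name `find_gxk_motifs`, and the toy peptide are all assumptions introduced for illustration, and the reported position is the 1-based position of the glycine.

```python
import re

# Assumption: residues counted as "hydrophobic"; the text does not define the set.
HYDROPHOBIC = "AVLIMFWY"

# A lookahead is used so that overlapping candidate sites are all reported.
MOTIF = re.compile(
    rf"(?=([{HYDROPHOBIC}]{{2}}G[A-Z]K[A-Z]D[{HYDROPHOBIC}]))"
)


def find_gxk_motifs(protein: str) -> list[tuple[int, str]]:
    """Return (1-based position of the G, matched 8-residue window) pairs."""
    seq = protein.upper()
    # m.start() is the 0-based index of the first hydrophobic residue;
    # the G sits two residues later, hence +3 for a 1-based position.
    return [(m.start() + 3, m.group(1)) for m in MOTIF.finditer(seq)]


# Hypothetical toy peptide, for illustration only.
print(find_gxk_motifs("MSTALVGNKCDLPQ"))
```

A scan of this kind only flags candidate sites; whether a given match actually contacts the nucleotide ring would still need structural or biochemical support.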

RdgB proteins are involved in key signal transduction
pathways related to neurotransmitter signaling. This is
based in part on the recognition of the existence and
significance of domains found in rdgB proteins (see Figure
1). For example, the experiments described herein
demonstrate that rdgB proteins contain a PYK2 binding
domain. PYK2 is believed to be responsible for regulating
neurotransmitter signaling. The rdgB proteins also
contain a PIT domain, which in Drosophila is involved in
PI transfer. PI transfer in humans is involved in the
recycling of synaptic vesicles. Thus, in view of the
roles of the PYK2 binding domain and the PIT domain, rdgB
proteins may be useful in the treatment of conditions of
the nervous system by enhancing or inhibiting such
signaling.

Thus, in a first aspect the invention features an
isolated, purified, enriched or recombinant nucleic acid
encoding a rdgB polypeptide. Preferably such nucleic acid
encodes a mammalian rdgB polypeptide, more preferably it
encodes a human rdgB polypeptide.

By "isolated" in reference to nucleic acid is meant
a polymer of 2 (preferably 21, more preferably 39, most
preferably 75) or more nucleotides conjugated to each
other, including DNA or RNA that is isolated from a
natural source or that is synthesized. The isolated
nucleic acid of the present invention is unique in the
sense that it is not found in a pure or separated state in
nature. Use of the term "isolated" indicates that a
naturally occurring sequence has been removed from its
normal cellular environment. Thus, the sequence may be in
a cell-free solution or placed in a different cellular
environment. The term does not imply that the sequence is
the only nucleotide chain present, but does indicate that
it is the predominant sequence present (at least 10-20%
more than any other nucleotide sequence) and is
essentially free (about 90-95% pure at least) of
non-nucleotide material naturally associated with it.
Therefore, the term does not encompass an isolated
chromosome encoding one or more rdgB polypeptides.

By the use of the term "enriched" in reference to
nucleic acid is meant that the specific DNA or RNA
sequence constitutes a significantly higher fraction (2-5
fold) of the total DNA or RNA present in the cells or
solution of interest than in normal or diseased cells or
in the cells from which the sequence was taken. This
could be caused by a person by preferential reduction in
the amount of other DNA or RNA present, or by a
preferential increase in the amount of the specific DNA or
RNA sequence, or by a combination of the two. However, it
should be noted that enriched does not imply that there
are no other DNA or RNA sequences present, just that the
relative amount of the sequence of interest has been
significantly increased in a useful manner and preferably
separate from a sequence library. The term significant
here is used to indicate that the level of increase is
useful to the person making such an increase, and
generally means an increase relative to other nucleic
acids of at least about 2 fold, more preferably at least
5 to 10 fold or even more. The term also does not imply
that there is no DNA or RNA from other sources. The other
source DNA may, for example, comprise DNA from a yeast or
bacterial genome, or a cloning vector such as pUC19. This
term distinguishes from naturally occurring events, such
as viral infection, or tumor type growths, in which the
level of one mRNA may be naturally increased relative to
other species of mRNA. That is, the term is meant to
cover only those situations in which a person has
intervened to elevate the proportion of the desired
nucleic acid.

It is also advantageous for some purposes that a
nucleotide sequence be in purified form. The term
"purified" in reference to nucleic acid does not require
absolute purity (such as a homogeneous preparation);
instead, it represents an indication that the sequence is
relatively purer than in the natural environment (compared
to the natural level this level should be at least 2-5
fold greater, e.g., in terms of mg/ml). Individual clones
isolated from a cDNA library may be purified to
electrophoretic homogeneity. The claimed DNA molecules
obtained from these clones could be obtained directly from
total DNA or from total RNA. The cDNA clones are not
naturally occurring, but rather are preferably obtained
via manipulation of a partially purified naturally
occurring substance (messenger RNA). The construction of
a cDNA library from mRNA involves the creation of a
synthetic substance (cDNA) and pure individual cDNA clones
can be isolated from the synthetic library by clonal
selection of the cells carrying the cDNA library. Thus,
the process which includes the construction of a cDNA
library from mRNA and isolation of distinct cDNA clones
yields an approximately 10^6-fold purification of the
native message. Thus, purification of at least one order
of magnitude, preferably two or three orders, and more
preferably four or five orders of magnitude is expressly
contemplated.
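The relationship between a fold purification and "orders of magnitude" is simply a base-10 logarithm, which the following minimal sketch makes explicit. The function names and the mapping of the stated preference levels onto numeric thresholds (1, 2 and 4 orders) are an illustrative reading of the paragraph above, not a definition taken from it.

```python
import math


def orders_of_magnitude(fold: float) -> float:
    """Convert a fold purification (e.g. 1e6) into orders of magnitude."""
    if fold <= 0:
        raise ValueError("fold purification must be positive")
    return math.log10(fold)


def preference_tier(fold: float) -> str:
    """Hypothetical tiering of the preference levels named in the text."""
    orders = orders_of_magnitude(fold)
    if orders >= 4:
        return "more preferred (four or five orders or more)"
    if orders >= 2:
        return "preferred (two or three orders)"
    if orders >= 1:
        return "at least one order"
    return "below one order"


# The approximately 10^6-fold purification mentioned for cDNA cloning.
print(orders_of_magnitude(1e6), preference_tier(1e6))
```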

By "rdgB polypeptide" is meant 9 or more contiguous
amino acids set forth in the full length amino acid
sequence of SEQ ID NO:4, SEQ ID NO:5, or SEQ ID NO:6. The
rdgB polypeptides can be encoded by full-length nucleic
acid sequences or any portion of a full-length nucleic
acid sequence, so long as a functional activity of the
polypeptide is retained. Preferred functional activities
include the ability to bind to the N-terminal portion of
PYK2. For example, the present invention encompasses
deletion mutants, isolated domains, and complementary
sequences capable of hybridizing to full length rdgB
protein under stringent hybridization conditions.
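The length-and-contiguity part of this definition can be checked mechanically, as in the minimal sketch below. The function name `is_contiguous_fragment` and the reference string are hypothetical (the string is not one of SEQ ID NO:4-6), and the sketch deliberately does not address the separate requirement that a functional activity be retained, which can only be established experimentally.

```python
def is_contiguous_fragment(candidate: str, full_length: str, min_len: int = 9) -> bool:
    """True if `candidate` has at least `min_len` residues and occurs as an
    uninterrupted stretch within `full_length`."""
    cand = candidate.upper()
    ref = full_length.upper()
    return len(cand) >= min_len and cand in ref


# Hypothetical reference sequence, for illustration only.
toy_reference = "MKTAYIAKQRQISFVKSHFSRQ"

print(is_contiguous_fragment("AYIAKQRQI", toy_reference))  # 9 contiguous residues
print(is_contiguous_fragment("AYIAKQRQ", toy_reference))   # only 8 residues
```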

In preferred embodiments, isolated nucleic acid
comprises, consists essentially of, or consists of a
nucleic acid sequence set forth in the full length nucleic
acid sequence SEQ ID NO:1, SEQ ID NO:2, or SEQ ID NO:3 or
at least 27, 30, 45, 60 or 90 contiguous nucleotides
thereof and the rdgB polypeptide comprises, consists
essentially of, or consists of at least 9, 10, 15, 20, 30,
50, 100, 200, or 300 contiguous amino acids of a rdgB
polypeptide.

By "comprising" it is meant including, but not
limited to, whatever follows the word "comprising". Thus,
use of the term "comprising" indicates that the listed
elements are required or mandatory, but that other
elements are optional and may or may not be present. By
"consisting of" is meant including, and limited to,
whatever follows the phrase "consisting of". Thus, the
phrase "consisting of" indicates that the listed elements
are required or mandatory, and that no other elements may
be present. By "consisting essentially of" is meant
including any elements listed after the phrase, and
limited to other elements that do not interfere with or
contribute to the activity or action specified in the
disclosure for the listed elements. Thus, the phrase
"consisting essentially of" indicates that the listed
elements are required or mandatory, but that other
elements are optional and may or may not be present
depending upon whether or not they affect the activity or
action of the listed elements.

Compositions and probes of the present invention may
contain human nucleic acids encoding a rdgB polypeptide
but are substantially free of nucleic acid not encoding
rdgB polypeptide. The human nucleic acid encoding a rdgB
polypeptide is at least 18 contiguous bases of the
nucleotide sequence set forth in SEQ ID NO:1, SEQ ID
NO:2, or SEQ ID NO:3 and will selectively hybridize to
human genomic DNA encoding a rdgB polypeptide, or is
complementary to such a sequence. The nucleic acid may be
isolated from a natural source by cDNA cloning or
subtractive hybridization; the natural source may be
blood, semen, and tissue of various organisms including
eukaryotes, mammals, birds, fish, plants, gorillas, rhesus
monkeys, chimpanzees and humans; and the nucleic acid may
be synthesized by the triester method or by using an
automated DNA synthesizer. In yet other preferred
embodiments the nucleic acid is a conserved or unique
region, for example those useful for the design of
hybridization probes to facilitate identification and
cloning of additional polypeptides, the design of PCR
probes to facilitate cloning of additional polypeptides,
and obtaining antibodies to polypeptide regions.

By "conserved nucleic acid regions" are meant
regions present on two or more nucleic acids encoding a
rdgB polypeptide, to which a particular nucleic acid
sequence can hybridize under lower stringency
conditions. Examples of lower stringency conditions
suitable for screening for nucleic acid encoding rdgB
polypeptides are provided in Abe et al., J. Biol. Chem.,
19:13361 (1992) (hereby incorporated by reference herein
in its entirety, including any drawings). Preferably,
conserved regions differ by no more than 7 out of 20
nucleotides.
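The "no more than 7 out of 20 nucleotides" preference can be illustrated by counting mismatches in every 20-nucleotide window of two aligned sequences. This is a minimal sketch under stated assumptions: the sequences are taken to be ungapped and of equal length, the function names are hypothetical, and the toy sequences are invented for illustration.

```python
def max_mismatches_per_window(seq_a: str, seq_b: str, window: int = 20) -> int:
    """Largest number of mismatches found in any `window`-nt stretch.

    Assumption: seq_a and seq_b are already aligned without gaps.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    if len(seq_a) < window:
        raise ValueError("sequences are shorter than the window")
    mismatch = [int(a != b) for a, b in zip(seq_a.upper(), seq_b.upper())]
    current = sum(mismatch[:window])
    worst = current
    # Slide the window one position at a time, updating the running count.
    for i in range(window, len(mismatch)):
        current += mismatch[i] - mismatch[i - window]
        worst = max(worst, current)
    return worst


def is_conserved(seq_a: str, seq_b: str, max_mismatches: int = 7, window: int = 20) -> bool:
    """True if no window exceeds the allowed number of mismatches."""
    return max_mismatches_per_window(seq_a, seq_b, window) <= max_mismatches


# Hypothetical 24-nt toy sequences differing at three positions.
toy_ref = "ACGT" * 6
toy_var = "TCGTACGTACATACGTACGTACGA"
print(max_mismatches_per_window(toy_ref, toy_var), is_conserved(toy_ref, toy_var))
```

The same helper can be called with a smaller `max_mismatches` to explore stricter criteria.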
By "unique nucleic acid region" is meant a sequence
present in a full length nucleic acid coding for a rdgB
polypeptide that is not present in a sequence coding for
any other naturally occurring polypeptide. Such regions
preferably comprise 12 or 20 contiguous nucleotides
present in the full length nucleic acid encoding a rdgB
polypeptide.

The invention also features a nucleic acid probe for
the detection of a rdgB polypeptide or nucleic acid
encoding a rdgB polypeptide in a sample. The nucleic acid
probe contains nucleic acid that will hybridize to at
least one sequence set forth in SEQ ID NO:1, SEQ ID NO:2,
or SEQ ID NO:3.

In preferred embodiments the nucleic acid probe
hybridizes to nucleic acid encoding at least 12, 27, 30,
35, 40, 50, 100, 200, or 300 contiguous amino acids of the
full-length sequence set forth in SEQ ID NO:4, SEQ ID
NO:5, or SEQ ID NO:6. Various low or high stringency
hybridization conditions may be used depending upon the
specificity and selectivity desired.

By "high stringency hybridization conditions" is
meant those hybridizing conditions that (1) employ low
ionic strength and high temperature for washing, for
example, 0.015 M NaCl/0.0015 M sodium citrate/0.1% SDS at
50°C; (2) employ during hybridization a denaturing agent
such as formamide, for example, 50% (vol/vol) formamide
with 0.1% bovine serum albumin/0.1% Ficoll/0.1%
polyvinylpyrrolidone/50 mM sodium phosphate buffer at
pH 6.5 with 750 mM NaCl, 75 mM sodium citrate at 42°C; or
(3) employ 50% formamide, 5 x SSC (0.75 M NaCl, 0.075 M
Sodium pyrophosphate, 5 x Denhardt's solution, sonicated
salmon sperm DNA (50 µg/ml), 0.1% SDS, and 10% dextran
sulfate at 42°C, with washes at 42°C in 0.2 x SSC and 0.1%
SDS. Under stringent hybridization conditions only highly
complementary nucleic acid sequences hybridize.
Preferably, such conditions prevent hybridization of
nucleic acids having 1 or 2 mismatches out of 20
contiguous nucleotides.
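The salt concentrations recited above can be related to the conventional "x SSC" shorthand, where 1 x SSC is 0.15 M NaCl with 0.015 M sodium citrate. The following minimal sketch converts a NaCl/citrate pair into its SSC multiple; the function name `ssc_multiple` and the tolerance parameter are assumptions introduced for illustration.

```python
import math

# Standard composition of 1 x SSC (saline-sodium citrate).
SSC_1X_NACL_M = 0.15
SSC_1X_CITRATE_M = 0.015


def ssc_multiple(nacl_molar: float, citrate_molar: float, rel_tol: float = 1e-6) -> float:
    """Return the 'x SSC' multiple for a NaCl / sodium citrate pair.

    Raises ValueError if the two salts do not correspond to the same multiple.
    """
    by_nacl = nacl_molar / SSC_1X_NACL_M
    by_citrate = citrate_molar / SSC_1X_CITRATE_M
    if not math.isclose(by_nacl, by_citrate, rel_tol=rel_tol):
        raise ValueError("NaCl and citrate do not match a single SSC multiple")
    return by_nacl


# Wash in condition (1): 0.015 M NaCl / 0.0015 M sodium citrate.
print(ssc_multiple(0.015, 0.0015))
# Hybridization salts in condition (2): 750 mM NaCl / 75 mM sodium citrate.
print(ssc_multiple(0.750, 0.075))
```

On this reading, the wash in condition (1) corresponds to about 0.1 x SSC and the salts in condition (2) to about 5 x SSC, which is consistent with the 5 x SSC stated in condition (3).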

Methods for using the probes include detecting the
presence or amount of rdgB RNA in a sample by contacting
the sample with a nucleic acid probe under conditions such
that hybridization occurs and detecting the presence or
amount of the probe bound to rdgB RNA. The nucleic acid
duplex formed between the probe and a nucleic acid
sequence coding for a rdgB polypeptide may be used in the
identification of the sequence of the nucleic acid
detected (for example see, Nelson et al., in Nonisotopic
DNA Probe Techniques, p. 275, Academic Press, San Diego
(Kricka, ed., 1992), hereby incorporated by reference
herein in its entirety, including any drawings). Kits for
performing such methods may be constructed to include a
container means having disposed therein a nucleic acid
probe.

The invention also features recombinant nucleic acid,
preferably in a cell or an organism. The recombinant
nucleic acid may contain a sequence set forth in SEQ ID
NO:1 and a vector or a promoter effective to initiate
transcription in a host cell. The recombinant nucleic
acid can alternatively contain a transcriptional
initiation region functional in a cell, a sequence
complementary to an RNA sequence encoding a rdgB
polypeptide and a transcriptional termination region
functional in a cell.

In another aspect the invention features an isolated, 
enriched or purified rdgB polypeptide. 

By "isolated" in reference to a polypeptide is meant
a polymer of 2 (preferably 7, more preferably 13, most
preferably 25) or more amino acids conjugated to each
other, including polypeptides that are isolated from a
natural source or that are synthesized. The isolated
polypeptides of the present invention are unique in the
sense that they are not found in a pure or separated state
in nature. Use of the term "isolated" indicates that a
naturally occurring sequence has been removed from its
normal cellular environment. Thus, the sequence may be in
a cell-free solution or placed in a different cellular
environment. The term does not imply that the sequence is
the only amino acid chain present, but that it is the
predominant sequence present (at least 10-20% more than
any other sequence) and is essentially free (about 90-95%
pure at least) of non-amino acid material naturally
associated with it.

By the use of the term "enriched" in reference to a
polypeptide is meant that the specific amino acid sequence
constitutes a significantly higher fraction (2-5 fold)
of the total of amino acids present in the cells or
solution of interest than in normal or diseased cells or
in the cells from which the sequence was taken. This
could be caused by a person by preferential reduction in
the amount of other amino acids present, or by a
preferential increase in the amount of the specific amino
acid sequence of interest, or by a combination of the two.
However, it should be noted that enriched does not imply
that there are no other amino acid sequences present, just
that the relative amount of the sequence of interest has
been significantly increased. The term significant here
is used to indicate that the level of increase is useful
to the person making such an increase, and generally means
an increase relative to other amino acids of at least
about 2 fold, more preferably at least 5 to 10 fold or
even more. The term also does not imply that there is no
amino acid from other sources. The other source amino
acid may, for example, comprise amino acid encoded by a
yeast or bacterial genome, or a cloning vector such as
pUC19. The term is meant to cover only those situations
in which a person has intervened to elevate the proportion
of the desired amino acid.

It is also advantageous for some purposes that an
amino acid sequence be in purified form. The term
"purified" in reference to a polypeptide does not require
absolute purity (such as a homogeneous preparation);
instead, it represents an indication that the sequence is
relatively purer than in the natural environment (compared
to the natural level this level should be at least 2-5
fold greater, e.g., in terms of mg/ml). Purification of
at least one order of magnitude, preferably two or three
orders, and more preferably four or five orders of
magnitude is expressly contemplated. The substance is
preferably free of contamination at a functionally
significant level, for example 90%, 95%, or 99% pure.

In preferred embodiments rdgB polypeptides contain at
least 9, 10, 15, 20, or 30 contiguous amino acids of the
full-length sequence set forth in SEQ ID NO:4, SEQ ID
NO:5, or SEQ ID NO:6.

In yet another aspect the invention features a
purified antibody (e.g., a monoclonal or polyclonal
antibody) having specific binding affinity to a rdgB
polypeptide. The antibody contains a sequence of amino
acids that is able to specifically bind to a rdgB
polypeptide.

By "specific binding affinity" is meant that the
antibody will bind to a hrdgB polypeptide at a certain
detectable amount but will not bind other polypeptides to
the same extent, under identical conditions. The present
invention also encompasses antibodies that can distinguish
hrdgB1 from hrdgB2 or hrdgB3 or can otherwise distinguish
between the various rdgBs.

Antibodies having specific binding affinity to a rdgB
polypeptide may be used in methods for detecting the
presence and/or amount of a rdgB polypeptide in a sample
by contacting the sample with the antibody under
conditions such that an immunocomplex forms and detecting
the presence and/or amount of the antibody conjugated to
the rdgB polypeptide. Diagnostic kits for performing such
methods may be constructed to include a first container
means containing the antibody and a second container means
having a conjugate of a binding partner of the antibody
and a label.

In another aspect the invention features a hybridoma
which produces an antibody having specific binding
affinity to a rdgB polypeptide.

By "hybridoma" is meant an immortalized cell line
which is capable of secreting an antibody, for example a
rdgB antibody.

In preferred embodiments the rdgB antibody comprises
a sequence of amino acids that is able to specifically
bind a rdgB polypeptide.

Another aspect of the invention features a method of
detecting the presence or amount of a compound capable of
binding to a rdgB polypeptide. The method involves
incubating the compound with a rdgB polypeptide and
detecting the presence or amount of the compound bound to
the rdgB polypeptide.

In preferred embodiments, the compound inhibits an
activity of rdgB. The present invention also features
compounds capable of binding and inhibiting rdgB
polypeptide that are identified by methods described
above.

In another aspect the invention features a method of
screening potential agents useful for treatment of a
disease or condition characterized by an abnormality in a
signal transduction pathway that contains an interaction
between a rdgB polypeptide and a natural binding partner
(NBP). The method involves assaying potential agents for
those able to promote or disrupt the interaction as an
indication of a useful agent.

By "screening" is meant investigating an organism for
the presence or absence of a property. The process may
include measuring or detecting various properties,
including the level of signal transduction and the level
of interaction between a rdgB polypeptide and a NBP.

By "disease or condition" is meant a state in an
organism, e.g., a human, which is recognized as abnormal
by members of the medical community. The disease or
condition may be characterized by an abnormality in one or
more signal transduction pathways in a cell, preferably a
cell listed in Table 1, wherein one of the components of
the signal transduction pathway is either a rdgB
polypeptide or a NBP.

Specific diseases or disorders which might be treated
or prevented, based upon the affected cells, include:
myasthenia gravis; neuroblastoma; disorders caused by
neuronal toxins such as cholera toxin, pertussis toxin, or
snake venom; acute megakaryocytic myelosis;
thrombocytopenia; those of the central nervous system such
as seizures, stroke, head trauma, spinal cord injury,
hypoxia-induced nerve cell damage such as in cardiac
arrest or neonatal distress, epilepsy, neurodegenerative
diseases such as Alzheimer's disease, Huntington's disease
and Parkinson's disease, dementia, muscle tension,
depression, anxiety, panic disorder, obsessive-compulsive
disorder, post-traumatic stress disorder, schizophrenia,
neuroleptic malignant syndrome, and Tourette's syndrome.
Conditions that may be treated by rdgB inhibitors include
epilepsy, schizophrenia, extreme hyperactivity in
children, chronic pain, and acute pain. Examples of
conditions that may be treated by PYK2-rdgB pathway
enhancers (for example a phosphatase inhibitor) include
stroke, Alzheimer's, Parkinson's, other neurodegenerative
diseases and migraine.

Preferred disorders include epilepsy, stroke,
schizophrenia, and Parkinson's disorder as there is an
established relationship between these disorders and the
function of potassium channels. See, McLean et al.,
Epilepsia 35:S5-S9, 1994; Ricard-Mousnier et al.,
Neurophysiologie Clinique 23:395-421, 1993; Crit. Rev.
Neurobiol. 7:187-203, 1994; Simon and Lin, Biophys. J.
64:A100, 1993; Birnstiel et al., Synapse (NY) 11:191-196,
1992; Coleman et al., Brain Res. 575:138-142, 1992; Popolip
et al., Br. J. Pharmacol. 104:907-913, 1991; Murphy et al.,
Exp. Brain Res. 84:355-358, 1991; Rutecki et al.,
Epilepsia 32:1-2, 1991; Fisher and Coyle (ed), Frontiers
of Clinical Neuroscience, Vol. 11 "Neurotransmitters and
Epilepsy"; Meeting, Woods Hole MA, USA IX+260P. John Wiley
and Sons, Inc. NY, NY; Treherne and Ashford, Neuroscience
40:523-532, 1991; Gehlert, Prog. Neuro-Psychopharmacol.
Biol. Psychiatry 18:1093-1102, 1994; Baudy, Expert Opin.
Ther. Pat. 1994 4/4:343-378; Porter and Rogawski,
Epilepsia 33:S1-S6, 1992; Murphy, J. Physiol. 453:167-183,
1992; Cromakalim, Drugs Future 17/3:237-239, 1992;
Carmeliet, Eur. Heart J. 12:30-37, 1991; Olpe et al.,
Experientia 47/3:254-257, 1991; Andrade et al., Science
234/4781:1261-1265, 1986; Forster, J. Neurosci. Methods
13/3-4:199-212, 1985.

In preferred embodiments, the methods described
herein involve identifying a patient in need of treatment.
Those skilled in the art will recognize that various
techniques may be used to identify such patients. For
example, cellular potassium levels may be measured or the
individual's genes may be examined for a defect.

By "abnormality" is meant a level which is
statistically different from the level observed in
organisms not suffering from such a disease or condition
and may be characterized as either an excess amount,
intensity or duration of signal or a deficient amount,
intensity or duration of signal. The abnormality in signal
transduction may be realized as an abnormality in cell
function, viability or differentiation state. The present
invention is based in part on the determination that such
abnormality in a pathway can be alleviated by action at
the PYK2-rdgB interaction site in the pathway. An
abnormal interaction level may also either be greater or
less than the normal level and may impair the normal
performance or function of the organism. Thus, it is also
possible to screen for agents that will be useful for
treating a disease or condition, characterized by an
abnormality in the signal transduction pathway, by testing
compounds for their ability to affect the interaction
between a rdgB polypeptide and PYK2, since the complex
formed by such interaction is part of the signal
transduction pathway. However, the disease or condition
may be characterized by an abnormality in the signal
transduction pathway even if the level of interaction
between the rdgB polypeptide and NBP is normal.

By "interact" is meant any physical association
between polypeptides, whether covalent or non-covalent.
This linkage can include many chemical mechanisms, for
instance covalent binding, affinity binding,
intercalation, coordinate binding and complexation.
Examples of non-covalent bonds include electrostatic
bonds, hydrogen bonds, and Van der Waals bonds.
Furthermore, the interactions between polypeptides may
either be direct or indirect. Thus, the association
between two given polypeptides may be achieved with an
intermediary agent, or several such agents, that connects
the two proteins of interest (e.g., a rdgB polypeptide and
PYK2). Another example of an indirect interaction is the
independent production, stimulation, or inhibition of both
a rdgB polypeptide and PYK2 by a regulatory agent.
Depending upon the type of interaction present, various
methods may be used to measure the level of interaction.
For example, the strengths of covalent bonds are often
measured in terms of the energy required to break a
certain number of bonds (i.e., kcal/mol). Non-covalent
interactions are often described as above, and also in
terms of the distance between the interacting molecules.
Indirect interactions may be described in a number of
ways, including the number of intermediary agents
involved, or the degree of control exercised over the rdgB
polypeptide relative to the control exercised over PYK2 or
another NBP.

By "disrupt" is meant that the interaction between
the rdgB polypeptide and PYK2 or a NBP is reduced either
by preventing expression of the rdgB polypeptide, or by
preventing expression of PYK2 or NBP, or by specifically
preventing interaction of the naturally synthesized
proteins or by interfering with the interaction of the
proteins.

By "promote" is meant that the interaction between a
rdgB polypeptide and PYK2 or NBP is increased either by
increasing expression of a rdgB polypeptide, or by
increasing expression of PYK2 or a NBP, or by decreasing
the dephosphorylating activity of the corresponding
regulatory PTP (or other phosphatase acting on other
phosphorylated signaling components), or by promoting
interaction of the rdgB polypeptide and PYK2 or NBP, or by
prolonging the duration of the interaction. Covalent
binding can be promoted either by direct condensation of
existing side chains or by the incorporation of external
bridging molecules. Many bivalent or polyvalent linking
agents are useful in coupling polypeptides, such as an
antibody, to other molecules. For example, representative
coupling agents can include organic compounds such as
thioesters, carbodiimides, succinimide esters,
diisocyanates, glutaraldehydes, diazobenzenes and
hexamethylene diamines. This listing is not intended to
be exhaustive of the various classes of coupling agents
known in the art but, rather, is exemplary of the more
common coupling agents. (See Killen and Lindstrom 1984,
J. Immunol. 133:1335-2549; Jansen, F.K., et al., 1982,
Immunological Rev. 62:185-216; and Vitetta et al., supra).

By "NBP" is meant a natural binding partner of a rdgB
polypeptide that naturally associates with a rdgB
polypeptide. The structure (primary, secondary, or
tertiary) of the particular natural binding partner will
influence the particular type of interaction between the
rdgB polypeptide and the natural binding partner. For
example, if the natural binding partner comprises a
sequence of amino acids complementary to the rdgB
polypeptide, covalent bonding may be a possible
interaction. Similarly, other structural characteristics
may allow for other corresponding interactions. The
interaction is not limited to particular residues and
specifically may involve phosphotyrosine, phosphoserine,
or phosphothreonine residues. A broad range of sequences
may be capable of interacting with rdgB polypeptides. One
example of a natural binding partner may be PYK2, which is
described above. Using techniques well known in the art,
one may identify several natural binding partners for rdgB
polypeptides, such as by utilizing a two-hybrid screen.

By "signal transduction pathway" is meant the
sequence of events that involves the transmission of a
message from an extracellular protein to the cytoplasm
through a cell membrane. The signal ultimately will cause
the cell to perform a particular function, for example, to
uncontrollably proliferate and therefore cause cancer.
Various mechanisms for the signal transduction pathway
(Fry et al., Protein Science 2:1785-1797, 1993) provide
possible methods for measuring the amount or intensity of
a given signal. Depending upon the particular disease
associated with the abnormality in a signal transduction
pathway, various symptoms may be detected. Those skilled
in the art recognize those symptoms that are associated
with the various other diseases described herein.
Furthermore, since some adapter molecules recruit
secondary signal transducer proteins towards the membrane,
one measure of signal transduction is the concentration
and localization of various proteins and complexes. In
addition, conformational changes that are involved in the
transmission of a signal may be observed using circular
dichroism and fluorescence studies.

In another aspect the invention features a method of
diagnosis of an organism for a disease or condition
characterized by an abnormality in a signal transduction
pathway that contains an interaction between a rdgB
polypeptide and PYK2 or a NBP. The method involves
detecting the level of interaction as an indication of
said disease or condition.

By "organism" is meant any living creature. The term
includes mammals, and specifically humans. Preferred
organisms include mice, as the ability to treat or
diagnose mice is often predictive of the ability to
function in other organisms such as humans.

By "diagnosis" is meant any method of identifying a
symptom normally associated with a given disease or
condition. Thus, an initial diagnosis may be conclusively
established as correct by the use of additional
confirmatory evidence such as the presence of other
symptoms. Current classification of various diseases and
conditions is constantly changing as more is learned about
the mechanisms causing the diseases or conditions. Thus,
the detection of an important symptom, such as the
detection of an abnormal level of interaction between rdgB
polypeptides and PYK2 or NBPs, may form the basis to define
and diagnose a newly named disease or condition. For
example, conventional cancers are classified according to
the presence of a particular set of symptoms. However, a
subset of these symptoms may both be associated with an
abnormality in a particular signalling pathway, such as
the ras 21 pathway, and in the future these diseases may be
reclassified as ras 21 pathway diseases regardless of the
particular symptoms observed.

Yet another aspect of the invention features a method
for treatment of an organism having a disease or condition
characterized by an abnormality in a signal transduction
pathway. The signal transduction pathway contains an
interaction between a rdgB polypeptide and PYK2 or a NBP
and the method involves promoting or disrupting the
interaction, including methods that target the rdgB:NBP
interaction directly, as well as methods that target other
points along the pathway.

By "dominant negative mutant protein" is meant a
mutant protein that interferes with the normal signal
transduction pathway. The dominant negative mutant
protein contains the domain of interest (e.g., an rdgB
polypeptide or PYK2 or a NBP), but has a mutation
preventing proper signaling, for example by preventing
binding of a second domain from the same protein. One
example of a dominant negative protein is described in
Millauer et al., Nature February 10, 1994. The agent is
preferably a peptide which blocks or promotes interaction
of the rdgB polypeptide and PYK2 or another NBP. The
peptide may be recombinant, purified, or placed in a
pharmaceutically acceptable carrier or diluent.

An EC50 or IC50 of less than or equal to 100 µM is
preferable, and even more preferably less than or equal to
50 µM, and most preferably less than or equal to 20 µM.
Such lower EC50's or IC50's are advantageous since they
allow lower concentrations of molecules to be used in vivo
or in vitro for therapy or diagnosis. The discovery of
molecules with such low EC50's and IC50's enables the design
and synthesis of additional molecules having similar
potency and effectiveness. In addition, the molecule may
have an EC50 or IC50 less than or equal to 100 µM at one or
more, but not all, cells chosen from the group consisting
of parathyroid cell, bone osteoclast, juxtaglomerular
kidney cell, proximal tubule kidney cell, distal tubule
kidney cell, cell of the thick ascending limb of Henle's
loop and/or collecting duct, central nervous system cell,
keratinocyte in the epidermis, parafollicular cell in the
thyroid (C-cell), intestinal cell, trophoblast in the
placenta, platelet, vascular smooth muscle cell, cardiac
atrial cell, gastrin-secreting cell, glucagon-secreting
cell, kidney mesangial cell, mammary cell, beta cell,
fat/adipose cell, immune cell and GI tract cell.
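The preference tiers in the preceding paragraph can be expressed as a simple lookup. The following is a minimal Python sketch, assuming all three thresholds are in micromolar (the units are garbled in the source text, so this reading is an assumption) and using a hypothetical function name, `potency_tier`, that does not come from the specification.

```python
def potency_tier(value_um):
    """Classify an EC50 or IC50 (in micromolar) against the stated tiers.

    Hypothetical helper for illustration; thresholds of 100, 50 and 20
    are taken from the text, with micromolar units assumed throughout.
    """
    if value_um <= 20:
        return "most preferable"
    if value_um <= 50:
        return "more preferable"
    if value_um <= 100:
        return "preferable"
    return "outside stated range"


if __name__ == "__main__":
    for value in (5, 20, 35, 100, 250):
        print(value, "uM ->", potency_tier(value))
```

The boundaries are inclusive because the text states each tier as "less than or equal to" its threshold.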

By "therapeutically effective amount" is meant an
amount of a pharmaceutical composition having a
therapeutically relevant effect. A therapeutically
relevant effect relieves to some extent one or more
symptoms of the disease or condition in the patient; or
returns to normal either partially or completely one or
more physiological or biochemical parameters associated
with or causative of the disease or condition. Generally,
a therapeutically effective amount is between about 1
nmole and 1 µmole of the molecule, depending on its EC50 or
IC50 and on the age and size of the patient, and the
disease associated with the patient.

In another aspect, the invention describes a
polypeptide comprising a recombinant rdgB polypeptide or
a unique fragment thereof. By "unique fragment" is meant
an amino acid sequence present in a full-length rdgB
polypeptide that is not present in any other naturally
occurring polypeptide. Preferably, such a sequence
comprises 6 contiguous amino acids present in the full
sequence. More preferably, such a sequence comprises 12
contiguous amino acids present in the full sequence. Even
more preferably, such a sequence comprises 18 contiguous
amino acids present in the full sequence.
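The idea of a unique fragment can be illustrated computationally. The following is a minimal Python sketch, assuming toy placeholder sequences (not any sequence from the sequence listing) and a hypothetical helper name, `unique_fragments`; it lists the windows of k contiguous amino acids in a target sequence that occur in none of a set of comparison sequences.

```python
def unique_fragments(target, others, k=6):
    """Return k-residue windows of `target` absent from every sequence in `others`.

    Hypothetical helper for illustration only; the sequences used below
    are toy placeholders, not real rdgB sequences.
    """
    # Collect every k-residue window present in the comparison sequences.
    seen = set()
    for seq in others:
        for i in range(len(seq) - k + 1):
            seen.add(seq[i:i + k])
    # Keep target windows not found elsewhere, in order, without repeats.
    result = []
    for i in range(len(target) - k + 1):
        frag = target[i:i + k]
        if frag not in seen and frag not in result:
            result.append(frag)
    return result


if __name__ == "__main__":
    toy_target = "MKTAYIAKQRQISFVK"          # placeholder sequence
    toy_others = ["MKTAYIAKQR", "QISFVKSH"]  # placeholder comparison set
    for k in (6, 12, 18):
        print(k, unique_fragments(toy_target, toy_others, k))
```

In practice the comparison set would be a database of naturally occurring polypeptides; the window lengths 6, 12 and 18 correspond to the preferred fragment lengths named in the text.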

By "recombinant rdgB polypeptide" is meant to include
a polypeptide produced by recombinant DNA techniques such
that it is distinct from a naturally occurring polypeptide
either in its location (e.g., present in a different cell
or tissue than found in nature), purity or structure.
Generally, such a recombinant polypeptide will be present
in a cell in an amount different from that normally
observed in nature.

In another aspect, the invention describes a
recombinant cell or tissue containing a purified nucleic
acid coding for a rdgB polypeptide. In such cells, the
nucleic acid may be under the control of its genomic
regulatory elements, or may be under the control of
exogenous regulatory elements including an exogenous
promoter. By "exogenous" it is meant a promoter that is
not normally coupled in vivo transcriptionally to the
coding sequence for the rdgB polypeptide.

In another aspect, the invention features a rdgB
polypeptide binding agent able to bind to a rdgB
polypeptide. The binding agent is preferably a purified
antibody which recognizes an epitope present on a rdgB
polypeptide. Other binding agents include molecules which
bind to the rdgB polypeptide and analogous molecules which
bind to a rdgB polypeptide.

By "purified" in reference to an antibody is meant
that the antibody is distinct from naturally occurring
antibody, such as in a purified form. Preferably, the
antibody is provided as a homogeneous preparation by
standard techniques. Uses of antibodies to the cloned
polypeptide include those to be used as therapeutics, or
as diagnostic tools.

In another aspect, the invention provides a nucleic
acid molecule comprising a nucleotide sequence that
encodes: (a) a polypeptide having an amino acid sequence
set forth in SEQ ID NO:4 from amino acid residues 1-616 or
616-974; (b) the complement of the nucleotide sequence of
(a); (c) a polypeptide having an amino acid sequence set
forth in SEQ ID NO:5 from amino acid residues 1-250,
250-900, or 900-1243; (d) the complement of the nucleotide
sequence of (c); (e) a polypeptide having an amino acid
sequence of SEQ ID NO:6 from amino acid residues 1-251,
251-985, or 985-1349; or (f) the complement of the
nucleotide sequence of (e). The utility of such isolated
domains in the design of protein inhibitors is well-known
to those skilled in the art.

The invention also provides an isolated nucleic acid
molecule comprising a nucleotide sequence that encodes a
polypeptide having the full length amino acid sequence set
forth in SEQ ID NO:4, SEQ ID NO:5, or SEQ ID NO:6 except
that it lacks at least one, but not more than two, of the
domains selected from the group consisting of the PIT, the
central domain, the PYK2 binding domain, the calcium
binding domain and the nucleotide binding domain. Such
deletion mutants are useful in the design of assays for
protein inhibitors. The nucleic acid molecules described
above may be, for example, cDNA or genomic DNA and may be
placed in a recombinant vector or expression vector. In
such a vector, the nucleic acid preferably is operatively
associated with the regulatory nucleotide sequence
containing transcriptional and translational regulatory
information that controls expression of the nucleotide
sequence in a host cell.

Thus, the invention also provides a genetically
engineered host cell containing any of the nucleotide
sequences described herein, in which the nucleic acid preferably
is operatively associated with the regulatory nucleotide
sequence containing transcriptional and translational
regulatory information that controls expression of the
nucleotide sequence in a host cell. Such host cells may
be either prokaryotic or eukaryotic.

Other features and advantages of the invention will
be apparent from the following description of the
preferred embodiments thereof, and from the claims.

Brief Description of the Figures 

Figure 1 shows the domains of some preferred full 
length rdgB proteins. 

Description of the Preferred Embodiments 

The present invention relates to rdgB polypeptides,
nucleic acids encoding such polypeptides, cells, tissues
and animals containing such nucleic acids, antibodies to
such polypeptides, assays utilizing such polypeptides, and
methods relating to all of the foregoing. Those skilled
in the art will recognize that many of the methods
described below in relation to rdgB, PYK-2, a NBP, or a
complex of rdgB with PYK-2 or a NBP could also be utilized
with respect to the other members of this group.

We describe the isolation and characterization of a
novel non-receptor tyrosine kinase binding protein, termed
rdgB. HrdgB1 is expressed in the brain, spleen, and
ovary. HrdgB2 is expressed in many human tissues
including brain, heart, thymus, and peripheral blood
leukocytes. HrdgB3 is highly expressed in the thymus but
is also expressed in the brain, heart, ovary, and testis.

The examples presented for PYK2, supra, reveal a
novel mechanism for the coupling between G-protein
coupled receptors and the MAP kinase signaling pathway.
These examples also showed that calcium influx induced by
membrane depolarization following activation of the
nicotinic acetylcholine receptor, or other stimuli that
cause calcium influx or release from internal stores, leads
to the activation of PYK2, tyrosine phosphorylation of
Shc, recruitment of Grb2/Sos and activation of the MAP
kinase signaling pathway. PYK2 can also link
extracellular signals with the JNK/SAP kinase signaling
pathway.

RdgB proteins represent a link in the observations
disclosed above. RdgB proteins are shown to bind to PYK2
with high affinity both in vitro and in vivo. Evidence of
this high affinity interaction is visualized in
experiments pulling PYK2 out of a cell lysate with
glutathione S-transferase fused rdgB proteins. These
experiments are described in the Examples section below.
In addition, the Drosophila homologs of the rdgB proteins
contain a phosphatidylinositol transfer domain as well
as a Ca2+ binding domain. Although the phosphatidylinositol
transfer domain is missing in an alternatively
spliced variant, all forms of rdgB proteins contain a Ca2+
binding domain. Thus the Ca2+ binding domain of rdgB
proteins is potentially involved in the Ca2+ response
observed in PYK2 signaling.

The model presented herein may represent the
mechanism underlying calcium mediated regulation of gene
expression in neuronal cells induced by NMDA receptor or
voltage sensitive calcium channels. The expression
pattern of PYK2 and the external stimuli that activate the
kinase, together with its role in the control of MAP kinase
and JNK signaling pathways, suggest a potential role for
PYK2 and rdgB proteins in the control of a broad array of
processes in the central nervous system, including neuronal
plasticity, highly localized control of ion channel
function, localized activation of the MAP
kinase and JNK signaling pathways, cell excitability, and
synaptic efficacy.

Various other features and aspects of the invention
are: Nucleic Acid Encoding A rdgB Polypeptide; A Nucleic
Acid Probe for the Detection of RdgB; Probe Based Method
And Kit For Detecting RdgB; DNA Constructs Comprising a
RdgB Nucleic Acid Molecule and Cells Containing These
Constructs; Purified rdgB Polypeptides; RdgB Antibody And
Hybridoma; An Antibody Based Method And Kit For Detecting
RdgB; Isolation of Compounds Which Interact With RdgB;
Compositions; Disruption of Protein Complexes; Antibodies
to Complexes; Pharmaceutical Formulations and Modes of
Administration; Identification of Agents; Purification and
Production of Complexes; Derivatives of Complexes; and
Evaluation of Disorders. All of these aspects and
features are explained in detail with respect to PYK-2 in
PCT publication WO 96/18738, which is incorporated herein
by reference in its entirety, including any drawings.
Those skilled in the art will readily appreciate that such
description can be easily adapted to rdgB as well, and is
equally applicable to the present invention.

Examples 

The examples below are non-limiting and are merely
representative of various aspects and features of the
procedures used to identify the full-length nucleic and
amino acid sequences of a series of rdgB proteins.
Experiments demonstrating rdgB expression, interaction and
signalling activities are also provided.

Material and Methods
Two hybrid screen

The yeast strain L40, containing the reporter genes
HIS3 and β-gal under control of an upstream LexA-binding
site, was used as a host for the two-hybrid screening.
The PYK2 N-terminal domain (aa 2-245), PYKN-ΔI (aa 2-237),
PYK-NN (aa 2-285) and the Fak N-terminal domain (aa
2-412) were fused in frame to the LexA DNA binding domain.
A yeast strain that expresses the LexA-PYKN fusion protein was
transformed with a human brain cDNA library (Clontech
#HL404AB) fused to the GAL4 transcriptional activation domain.
Transformants were plated on agar selection medium lacking
Uracil (Ura-), Tryptophan (Trp-), Leucine (Leu-) and
Histidine (His-). Resulting colonies were isolated and
retested for growth on -Ura-Trp-Leu-His plates and for
β-galactosidase activity. Plasmid DNA was purified from
colonies that were His+, β-gal+ and used for
retransformation of yeast strains expressing heterologous
baits to determine the specificity of the interaction.

Isolation of rdgBs cDNAs 

hrdgB1: A human brain substantia nigra cDNA library
(λgt10, Clontech HL1179a) was screened with a 32P-labelled
probe derived from the yeast prey plasmid encoding GAL 10-
rdgB1. Four independent clones were isolated, subcloned
and analyzed by sequencing. Sequence analysis indicated
that the 5' end of the gene was missing from our clones.
Therefore a human fetal brain cDNA library (λgt11, Clontech
HL3003b) was screened with a probe derived from the most 5'
region of our new cDNA contig. Sequence analysis of six
independent clones that were isolated indicated that all
of them belong to the same gene, hrdgB1, but they are
missing the 5' end of the gene. A specific-primed cDNA
library was constructed in λZapII utilizing human fetal
brain Poly(A)+ RNA as template for our cDNA synthesis
(Stratagene Kit). Fifteen independent clones were isolated and
allowed subsequent isolation of the full length cDNA of
hrdgB1.

hrdgB2 and hrdgB3: A DNA fragment derived from an
EST fragment (T12574) was amplified by PCR from human
fetal brain cDNA. The PCR product was subcloned,
sequenced and used as a probe for screening a human fetal
brain cDNA library (λgt11, Clontech H15015b). One
positive clone was obtained from this screen. Sequence
analysis indicated that it is a partial cDNA clone of a
novel gene that belongs to the human rdgB family. The cDNA
insert of this clone (1.8kb) was used as a probe for
rescreening the same cDNA library. Seven independent
clones were obtained, subcloned and sequenced. Sequence
analysis indicated that all of them belong to the same
gene, hrdgB2, but they are different from the original
clone that was isolated from the same library. The 3' end
of our first clone (1.8kb) was used as a probe to screen
a human heart cDNA library (Clontech 7759-1, 7760-1) and
allowed subsequent isolation of two alternatively spliced
isoforms of hrdgB3.

Northern blot 

Human multiple tissue Northern blots (Clontech
HL11296) were hybridized under high-stringency conditions
using 32P-labelled cDNA fragments of hrdgB1 (EcoRI-Eco47III,
nuc# 245-511), hrdgB2 (SacI-Eco47III, nuc# 1540-2661) and
hrdgB3 (BstXI, nuc# 912-1472) as probes according to the
instructions of the manufacturer.

Plasmid Constructs - Two-hybrid constructs:

Fusion with LexA DNA-binding domain: PCR was used to
amplify different regions of PYK2 and Fak cDNAs as
indicated. The amplified DNA fragments were subcloned into
pBTM116 in frame to generate a fusion protein with the LexA
DNA-binding domain.

Fusion with GAL4 activation domain: PCR was used to
amplify different regions of hrdgB1, hrdgB2 or hrdgB3
cDNAs as indicated. The amplified DNA fragments were
subcloned into pGAD10 (Clontech) in frame to generate a
fusion protein with the GAL4 activation domain.

Expression vectors 

The full length cDNAs of hrdgB1, hrdgB2 and hrdgB3
were subcloned into pCMP1 downstream of the CMV promoter. An
HA-epitope tag (YPYDVPDYAS), SEQ ID NO: 10, was fused in
frame to their carboxy terminal ends. The PYK2 binding
domain of hrdgB2 (residues 911-1243) was subcloned into
pCMV-NEO, which encodes an initiator methionine codon
followed by a Myc epitope tag (EQKLISEEDL), SEQ ID NO: 1,
immediately upstream of the cloning site.

Antibodies 

Antibodies against hrdgB1 were raised in rabbits
immunized either with a synthetic peptide corresponding to
amino acids 965-974 of hrdgB1 (C-Ter Ab), or with a
GST-fusion protein containing residues 231-374 (N-Ter Ab).
Antibodies against hrdgB2 were raised in rabbits immunized
with a synthetic peptide corresponding to amino acids
152-163 of hrdgB2. Antibodies against hrdgB3 were raised in
rabbits against an MBP-fusion protein containing residues
7-116 of hrdgB3.

Example 1 : 

Isolation of human rdgB proteins 

The yeast two-hybrid system was used to identify
proteins that interact with the amino-terminal domain of
PYK2. The N-terminal domain of PYK2 was fused to the LexA
DNA binding domain and used to screen a human brain cDNA
library. Using a His synthetase gene (HIS3) under the
control of LexA operators as a reporter, 124 His+ colonies
were identified from an initial screen of a million
transformants. Of these, 24 were also β-galactosidase
positive (β-gal+). Retransformation of these clones into
a yeast strain expressing the LexA-PYK2-N fusion protein
indicated that only one interacts with the PYK2 N-terminal
domain (PYK2-N). The specificity of the interaction was
further determined by transformation of this clone into
yeast strains expressing heterologous baits. An
interaction was detected in yeast strains expressing either
the PYK-N terminal domain, or a shorter version of PYK-N
that was missing 48 amino acids from its C-terminal end.
No interaction, however, was detected in strains
expressing either the PYK-NN (amino acids 2-285), or the
N-terminal domain of Fak, suggesting that this interaction
is very specific.
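The attrition through the screen described above can be tabulated directly from the reported counts. The following is a minimal Python sketch; the counts are those stated in the text, while the variable and function names are hypothetical and chosen only for illustration.

```python
# Counts reported in the text above.
transformants = 1_000_000  # "a million transformants"
his_positive = 124         # His+ colonies
bgal_positive = 24         # His+ colonies that were also beta-gal positive
specific = 1               # clone that retested positive with PYK2-N


def fraction(part, whole):
    """Fraction of `whole` retained at one screening step."""
    return part / whole


steps = [
    ("His+ / transformants", his_positive, transformants),
    ("beta-gal+ / His+", bgal_positive, his_positive),
    ("specific / beta-gal+", specific, bgal_positive),
]

if __name__ == "__main__":
    for label, part, whole in steps:
        print(f"{label}: {part}/{whole} = {fraction(part, whole):.2e}")
```

Each ratio is the share of candidates surviving one selection step, which makes clear how stringent the successive His, β-galactosidase and retransformation criteria were.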
The clone that scored for specific interaction with
PYK2-N contained a partial cDNA which allowed subsequent
isolation of a 3.1 kb cDNA with an open reading frame of
975 amino acids. The coding region was flanked by 5' and
3' untranslated regions of 93 and 149 bp respectively. The
5' untranslated region contains triplet repeats (CGG), a
motif that was identified in many neuropsychiatric
disorders. This region showed homology to the
untranslated region of the human Fragile X mental
retardation FMR-1 gene (66.3% match) using the
Smith-Waterman algorithm.

A BLAST search with the full length cDNA sequence
revealed that this protein is related to the Drosophila
retinal degeneration B protein (rdgB) and therefore it was
named hrdgB1. The Drosophila rdgB protein has an important
role in the phototransduction pathway. The rdgB mutant was
initially identified by defects in the compound eye, in
that rdgB mutant flies undergo light-enhanced
photoreceptor cell degeneration. The Drosophila rdgB
protein contains a phosphatidylinositol transfer domain
(PI-TP) in its N-terminal portion, and a calcium binding
site downstream. The protein contains six hydrophobic
regions that were identified as transmembrane domains.
The same hydrophobic regions are conserved in the hrdgB1
protein. However, analysis of the hrdgB1 sequence, as well as
the Drosophila homolog, using different algorithms
(PROSITE) indicated that they are not classical
transmembrane domains.

WO 98/16639                              PCT/US97/17374

An EST database search with the Drosophila rdgB
sequence allowed the identification of two additional
human genes that belong to the same gene family. A PCR
fragment derived from an EST fragment (T12574) was used as
a probe to screen a human brain cDNA library and subsequently
to isolate the hrdgB2 gene. The full length cDNA of hrdgB2
(4186 bp) contained an open reading frame of 1244 amino acids
which was flanked by a 5' untranslated region of 174 bp and
a 3' untranslated region of 280 bp. The 257 amino acids in
the N-terminal end of the hrdgB2 protein have 41%
similarity to the entire human PtdInsTP (M73704).
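The reported cDNA dimensions can be cross-checked with simple arithmetic. The following is a minimal Python sketch, assuming that the stated open reading frame lengths count the stop codon as one triplet (an assumption, not something the text states) and using a hypothetical helper name, `cdna_length`.

```python
def cdna_length(utr5_bp, orf_codons, utr3_bp):
    """Total cDNA length in bp: 5' UTR + 3 bp per ORF codon + 3' UTR.

    Hypothetical helper; `orf_codons` is taken to include the stop codon.
    """
    return utr5_bp + 3 * orf_codons + utr3_bp


# Figures as reported in the text.
hrdgb2_total = cdna_length(utr5_bp=174, orf_codons=1244, utr3_bp=280)
hrdgb1_total = cdna_length(utr5_bp=93, orf_codons=975, utr3_bp=149)

if __name__ == "__main__":
    print("hrdgB2:", hrdgb2_total, "bp (reported: 4186 bp)")
    print("hrdgB1:", hrdgb1_total, "bp (reported: about 3.1 kb)")
```

Under that assumption the hrdgB2 figures sum exactly to the reported 4186 bp, and the hrdgB1 figures sum to 3167 bp, in line with the approximate size given for that cDNA.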

The full length cDNA of hrdgB3 was obtained by
screening human brain and heart cDNA libraries. An
initial clone of 1.8 kb was isolated from a human brain
library using the PCR product derived from EST fragment
(T12574) as a probe. A cDNA fragment derived from our
1.8 kb clone was used as a probe to screen a human heart
cDNA library and allowed subsequent isolation of the hrdgB3
gene. Two isoforms arising from alternative splicing have
been identified by cDNA cloning: the longer, which encodes
a protein of 1349 amino acids with a predicted molecular
weight of 150 kDa, and a shorter one, which lacks amino
acids 50-378, with a predicted molecular weight of 120 kDa.
The coding sequence is flanked by a 79 bp 5' untranslated
region and a 945 bp 3' untranslated region. The N-terminal
region of hrdgB3 contains a PI-TP domain that is missing
from the alternatively spliced isoform. A stretch of
glycines and serines was identified within amino acids
612-634 (78% glycine, 22% serine).
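The isoform sizes and the composition of the glycine/serine stretch can be approximated from the residue numbers alone. The following is a rough Python sketch, assuming an average residue mass of about 110 Da (a common rule of thumb, not a value from the text); the predicted weights quoted above were presumably computed from the actual sequences, so only approximate agreement should be expected.

```python
AVG_RESIDUE_DA = 110.0  # assumed rule-of-thumb average mass per amino acid


def approx_mass_kda(n_residues):
    """Rough protein mass in kDa from residue count (illustrative only)."""
    return n_residues * AVG_RESIDUE_DA / 1000.0


full_len = 1349                 # long isoform, as reported
deleted = 378 - 50 + 1          # residues 50-378 inclusive
short_len = full_len - deleted  # short isoform length

stretch_len = 634 - 612 + 1     # residues 612-634 inclusive
gly = round(0.78 * stretch_len) # glycine count implied by 78%
ser = stretch_len - gly         # remainder is serine

if __name__ == "__main__":
    print("long isoform:", full_len, "aa,", approx_mass_kda(full_len), "kDa")
    print("short isoform:", short_len, "aa,", approx_mass_kda(short_len), "kDa")
    print("Gly/Ser stretch:", stretch_len, "aa ->", gly, "Gly,", ser, "Ser")
```

The long-isoform estimate lands close to the stated 150 kDa, while the short-isoform estimate comes out somewhat below the stated 120 kDa, which illustrates the limits of the averaged residue mass. The 23-residue stretch corresponds to 18 glycines and 5 serines, matching the stated percentages.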

Multiple alignment analysis of the novel hrdgB1,
hrdgB2 and hrdgB3 revealed high similarity in their
primary structure: a PI-TP domain in the amino-terminal
region, six conserved hydrophobic regions and a very
conserved C-terminal region. Unlike the other rdgB family
members, hrdgB1 does not contain a PtdInsTP domain; this may
suggest that our clone represents an alternatively spliced
isoform.

Example 2 : 

Tissue distribution of human rdgBs 
The levels of hrdgB1, hrdgB2 and hrdgB3 mRNA
expression were determined by Northern analysis of various
human tissues. HrdgB1 has a very restricted expression
pattern. It is expressed in the brain, spleen and ovary
as a message of approximately 7.5 kb. By contrast, hrdgB2
is highly expressed in many human tissues as a message of
4.5 kb. Highest levels of expression were detected in the
brain, heart, thymus and peripheral blood leukocytes.
HrdgB3 is very highly expressed in the thymus, but it is
also expressed in the heart, brain, ovary and testis. Two
messages were detected for hrdgB3: 7.5 kb and 9.5 kb
messages that may represent the two alternatively spliced
isoforms that were isolated. The results discussed above
indicate that the rdgB gene family members have very different
expression patterns: hrdgB1 is very rare, hrdgB2
is abundant and hrdgB3 has a unique pattern of expression.

15 Example 3 : 

Mapping the minimal interaction domain of rdgB proteins 

To map the PYK2 interaction domain within the hrdgB1
protein, a series of hrdgB1 deletion mutants were
constructed and their ability to interact with PYK2-N was
tested utilizing the two hybrid system. Our original two
hybrid clone containing amino acids 627-975 of hrdgB1 was
used as a positive control. Deletion mutants were
constructed, and among all these mutants, only hrdgB1-ΔIV,
containing amino acids 627-936, interacts with the PYK2
N-terminal domain. The interaction of this domain with PYK2
was further confirmed by an in vitro binding experiment,
showing binding of PYK2 to an immobilized GST-fusion protein
containing the same portion of hrdgB1. No binding was
detected, however, to the GST protein alone or between the
hrdgB1-ΔIV mutant and the focal adhesion kinase.

Since hrdgB1 shares high homology with hrdgB2 and
hrdgB3 in their C-terminal domains, whether the
corresponding regions of these two proteins interact with
PYK2 was examined. For this purpose amino acids 911-1244
and 996-1350 of hrdgB2 and hrdgB3 respectively were fused
in frame to the activator domain of Gal-4, and their
ability to interact with PYK2-N was tested by the two
hybrid system. The results indicate that hrdgB2 can
strongly bind to the PYK2 N-terminal domain, whereas the
interaction of hrdgB3 with PYK2 is quite weak.

To further confirm this interaction in vivo, hrdgB2-HA
or hrdgB3-HA were coexpressed either with PYK2 or with
Fak in COS cells. Following cell lysis, hrdgB proteins
were immunoprecipitated by anti-HA antibodies and the
presence of PYK2 or Fak in the immunocomplexes was
determined by immunoblotting with antibodies against PYK2
or Fak respectively. The results indicate that both
hrdgB2 and hrdgB3 interact with PYK2 in vivo. No
interaction, however, was detected with the related kinase
Fak, suggesting that hrdgB proteins interact strongly and
specifically with PYK2.

To explore whether the 'PYK2 binding domain' of
hrdgBs is sufficient to confer association of those two
proteins in vivo, a myc-tagged version of the hrdgB2
'PYK2-binding domain' was coexpressed either with PYK2 or
with Fak in COS cells, and their interaction was analyzed.
The results showed that this domain can interact with PYK2
in vivo and therefore represents a separate domain in this
family of proteins.

Example 4:

In vivo association of rdgB1 and PYK2

To confirm the interaction of hrdgB1 and PYK2 in vivo,
a hemagglutinin-tagged rdgB1 and PYK2 were coexpressed in
293 cells. The results indicate that hrdgB1 strongly
associates with PYK2. Association of hrdgB1 with the
related kinase Fak could not be detected under the same
experimental conditions, suggesting a strong and specific
interaction of hrdgB1 and PYK2.

To further characterize the interaction between hrdgB
and PYK2, an adult rat brain was used as a source of these
two proteins. When hrdgB1 was immunoprecipitated from a
rat brain homogenate, utilizing specific antibodies
against hrdgB1, PYK2 could be detected in the
immunocomplex. However, the stoichiometry of the PYK2/rdgB1
interaction was not as high as shown in transfected cells.
These results indicate that PYK2 and rdgB1 interact in
vivo under physiological conditions, and this interaction
may have an important regulatory function in the brain.

Other embodiments are within the following claims.

SEQUENCE LISTING



(1) GENERAL INFORMATION:

  (i) APPLICANT: Lev, Sima

  (ii) TITLE OF INVENTION: RDGB PROTEINS AND RELATED
       PRODUCTS AND METHODS

  (iii) NUMBER OF SEQUENCES: 11

  (iv) CORRESPONDENCE ADDRESS:
    (A) ADDRESSEE: Lyon & Lyon
    (B) STREET: 633 West Fifth Street, Suite 4700
    (C) CITY: Los Angeles
    (D) STATE: California
    (E) COUNTRY: U.S.A.
    (F) ZIP: 90071-2066

  (v) COMPUTER READABLE FORM:
    (A) MEDIUM TYPE: 3.5" Diskette, 1.44 Mb storage
    (B) COMPUTER: IBM Compatible
    (C) OPERATING SYSTEM: IBM P.C. DOS 5.0
    (D) SOFTWARE: FastSeq

  (vi) CURRENT APPLICATION DATA:
    (A) APPLICATION NUMBER: To Be Assigned
    (B) FILING DATE:
    (C) CLASSIFICATION:

  (vii) PRIOR APPLICATION DATA:
    (A) APPLICATION NUMBER:
    (B) FILING DATE:

  (viii) ATTORNEY/AGENT INFORMATION:
    (A) NAME: Warburg, Richard J.
    (B) REGISTRATION NUMBER: 32,327
    (C) REFERENCE/DOCKET NUMBER: 222/105

  (ix) TELECOMMUNICATION INFORMATION:
    (A) TELEPHONE: (213) 489-1600
    (B) TELEFAX: (213) 955-0440
    (C) TELEX: 67-3510



(2) INFORMATION FOR SEQ ID NO: 1:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 3109 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:



GCGGCGGCGG CTGCGGTGGC GGCAGCGAGG CGAGCGGGGC GGGGGCGCGG GCGCGGCGCT  60
CGGAGTCCGT TCGGGGCCGG AGGCGGTCGG GGCCGGGCCC GGGAAGCGCG AGGAGCGCGC  120
GTAGCCGCCG GAGCCCGCCG CCCGGGACAT GGCCAAGGCG GGCCGTGCAG GTGGTCCTCC  180
CCCGGGCGGC GGTGCCCCCT GGCACCTTCG AAATGTCCTC AGTGACTCTG TGGAGAGCTC  240
AGATGATGAA TTCTTTGATG CCAGAGAGGA GATGGCTGAA GGGAAGAATG CCATCCTCAT  300
TGGGATGAGC CAGTGGAACT CCAATGACCT CGTGGAGCAG ATCGAGACCA TGGGGAAACT  360
GGACGAGCAT CAAGGAGAAG GGACCGCGCC GTGCACATCC AGCATCCTCC AGGAGAAGCA  420
GCGAGAACTG TACCGGGTTT CCTTGAGAAG ACAGAGGTTC CCAGCCCAGG GAAGCATCGA  480
GATCCACGAA GACAGCGAGG AAGGCTGCCC GCAGCGCTCC TGCAAGACAC ATGTCCTCCT  540
GCTGGTCCTG CATGGGGGAA ACATCCTGGA CACGGGTGCC GGGGACCCGT CCTGCAAGGC  600
AGCCGACATC CACACCTTCA GCTCCGTGCT GGAGAAGGTC ACACGAGCCC ATTTCCCTGC  660
TGCCCTGGGC CACATCCTCA TCAAGTTCGT CCCCTGTCCT GCCATCTGCT CTGAGGCTTT  720
CTCGCTTGTC TCTCACCTGA ACCCCTACAG CCACGATGAG GGCTGCCTCA GCAGCAGCCA  780
GGACCACGTC CCTCTGGCCG CCCTTCCCCT GTTGGCCATC TCCTCCCCGC AGTACCAGGA  840
TGCTGTCGCC ACCGTCATCG AGCGAGCCAA CCAGGTCTAC AGAGAGTTCC TGAAGTCCTC  900
TGATGGGATT GGCTTCAGTG GGCAGGTGTG TCTCATCGGG GACTGTGTGG GGGGCCTCCT  960
GGCCTTCGAT GCCATCTGCT ACAGTGCGGG GCCCTCAGGG GACAGCCCTG CCAGCAGCAG  1020
CCGGAAGGGG AGCATCAGCA GCACCCAGGA CACCCCAGTC GCGGTGGAGG AAGATTGCAG  1080
CCTGGCCAGC AGCAAGCGTC TCAGCAAAAG CAACATTGAC ATCTCCAGTG GGTTGGAGGA  1140
TGAGGAGCCC AAGAGGCCGT TGCCGCGGAA ACAGAGCGAC TCCTCCACCT ATGACTGCGA  1200
GGCCATCACC CAGCACCATG CCTTCCTCTC AAGCATCCAC TCCAGCGTGC TAAAGGATGA  1260
GTCTGAGACC CCGGCGGCTG GGGGGCCGCA GCTCCCTGAG GTCAGCCTGG GCCGCTTTGA  1320
CTTCGATGTG TCCGACTTCT TCCTCTTCGG CTCGCCACTG GGCCTGGTCC TGGCCATGCG  1380
GAGGACGGTG CTGCCTGGGC TGGACGGCTT CCAGGTGCGT CCTGCCTGCA GCCAGGTCTA  1440
CAGCTTCTTC CATTGCGCAG ACCCCTCTGC CTCACGGCTC GAGCCACTGC TGGAGCCCAA  1500
GTTCCACCTG GTGCCGCCTG TCAGCGTGCC TCGCTACCAG AGGTTCCCAC TGGGCGATGG  1560
GCAGTCCCTC CTCCTCGCTG ATGCCCTACA CACCCACAGC CCCCTCTTCC TGGAGGGCAG  1620
CTCCCGGGAC AGCCCGCCAC TTCTGGATGC CCCTGCCTCG CCCCCTCAGG CCTCGAGGTT  1680
CCAGCGCCCA GGACGGAGGA TGAGCGAGGG GAGCTCCCAC AGCGAGAGCT CGGAGTCCTC  1740
GGACAGCATG GCACCCGTGG GTGCCTCCCG CATCACAGCC AAGTGGTGGG GAAGCAAGAG  1800
GATCGACTAT GCCCTGTACT GCCCTGATGT CCTCACGGCC TTCCCCACCG TGGCCCTGCC  1860
CCACCTCTTC CACGCCAGTT ACTGGGAGTC CACAGACGTG GTGGCCTTCA TCCTGAGACA  1920
GGTAATGCGC TATGAGAGCG TGAACATCAA GGAAAGCGCC CGCCTGGACC CTGCAGCACT  1980
GAGTCCTGCC AACCCCCGGG AGAAGTGGCT TCGTAAGCGG ACTCAGGTCA AGCTGAGGAA  2040
TGTCACGGCT AATCACCGGG CCAATGATGT GATTGCTGCT GAAGATGGCC CCCAGGTCCT  2100
GGTGGGGCGG TTCATGTACG GGCCCCTCGA CATGGTGGCT CTGACTGGAG AGAAGGTGGA  2160
CATCCTAGTA ATGGCAGAGC CATCCTCAGG CCGCTGGGTA CACCTGGACA CAGAGATCAC  2220
CAACAGCAGT GGTCGCATCA CATACAATGT GCCGCGGCCC CGGCGCCTGG GGGTTGGTGT  2280
CTATCCTGTG AAGATGGTCG TCAGGGGCGA CCAGACCTGT GCCATGAGCT ACCTCACGGT  2340
GTTGCCCAGG GGCATGGAGT GTGTAGTGTT CAGCATTGAT GGGTCCTTCG CGGCCAGCGT  2400
GTCTATCATG GGAAGCGACC CCAAGGTCCG GCCGGGTGCA GTGGATGTTG TCCGGCACTG  2460
GCAGGACTTG GGCTACATGA TCCTTTACAT CACGGGACGG CCGGACATGC AGAAGCAGCG  2520
GGTGGTGTCG TGGCTGTCCC AGCACAACTT CCCACAGGGC ATGATCTTCT TCTCCGACGG  2580
GCTGGTGCAT GACCCGCTGC GGCAGAAGGC CATCTTCCTG CGCAACCTCA TGCAGGAGTG  2640
CTTCATCAAA ATCAGTGCGG CCTATGGCTC CACGAAGGAC ATCTCTGTCT ACAGCGTGCT  2700
GGGCCTGCCT GCCTCCCAGA TCTTCATTGT GGGCCGGCCC ACCAAGAAGT ACCAAACCCA  2760
GTGCCAGTTC CTGAGCGAGG GCTACGCCGC ACACCTGGCC GTGCTGGAGG CCAGCCACCG  2820
CTCACGCCCA AAGAAGAACA ACTCGCGCAT GATCCTGCGC AAGGGCAGCT TCGGGCTGCA  2880
CGCGCAGCCA GAGTTCCTGC GGAAGCGCAA CCACCTGCGC AGAACCATGT CAGTGCAGCA  2940
GCCCGACCCG CCCGCCGCCA ACCCCAAGCC CGAGCGGGCC CAGAGCCAGC CCGAGTCGGA  3000
CAAAGACCAC GAGCGGCCGC TGCCGGCGCT CAGCTGGGCG CGTGGGCCCC CCAAGTTCGA  3060
GTCGGTGCCC TGAGGGGTGG GCTGTGCTCA GAGCAGGGAG CGGGGGCCG  3109

(2) INFORMATION FOR SEQ ID NO: 2:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 4190 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

CCGGCACTGC GCCTCGGGAG GGTCCGGCCA CCGCTGGAAC CCGAGGCCGG GGCTGGGGGC  60
GCTCCGGGCT CCGACCCACG GGCCGGCCGG CCCTGCCCGG GCTGGGTGAG GGGCGCCCGC  120
CTCAAGCTAG AGGAGGAGCG GAGGCCGCGC GCGGCCCGCC GAGCGCCTTC AGGATGCTCA  180
TCAAGGAATA CCACATTCTG CTGCCCATGA GCCTGGACGA GTACCAGGTG GCCCAGCTCT  240
ACATGATCCA GAAAAAGAGC CGGGAGGAGT CTAGTGGTGA GGGCAGCGGC GTGGAGATCC  300
TGGCCAACCG GCCCTACACG GATGGGCCCG GGGGCAGCGG GCAATACACA CACAAGGTGT  360
ACCACGTGGG CTCCCACATC CCAGGCTGGT TCCGGGCACT GCTGCCCAAG GCTGCCCTGC  420
AGGTAGAAGA GGAATCCTGG AATGCCTACC CCTACACCCG AACCCGGTAC ACCTGCCCTT  480
TCGTGGAGAA ATTCTCCATT GAAATTGAGA CCTATTACCT GCCTGATGGG GGGCAGCAGC  540
CAAACGTCTT CAACCTGAGC GGGGCCGAGA GGAGACAGCG CATCCTGGAC ACCATCGACA  600
TCGTGCGGGA TGCAGTGGCC CCAGGCGAGT ACAAAGCAGA AGAGGACCCC CGGCTTTATC  660
ACTCGGTCAA GACGGGCCGA GGGCCACTGT CTGATGACTG GGCACGGACG GCGGCACAGA  720
CGGGGCCCCT TATGTGTGCC TATAAGCTGT GCAAGGTTGA GTTCCGCTAC TGGGGCATGC  780
AAGCCAAGAT CGAGCAGTTC ATCCATGATG TAGGTCTGCG TCGGGTGATG CTGCGGGCCC  840
ACCGCCAGGC CTGGTGCTGG CAGGATGAGT GGACAGAGCT GAGCATGGCT GACATCCGGG  900
CACTGGAAGA GGAGACTGCT CGCATGCTGG CCCAGCGCAT GGCCAAGTGC AACACAGGCA  960
GTGAGGGGTC CGAGGCCCAG CCCCCCGGGA AACCGAGCAC CGAGGCCCGG TCTGCGGCCA  1020
GCAACACTGG CACCCCCGAT GGGCCTGAGG CCCCCCCAGG CCCAGATGCC TCCCCCGATG  1080
CCAGCTTTGG GAAGCAGTGG TCCTCATCCT CCCGTTCCTC CTACTCATCC CAACATGGAG  1140
GGGCTGTGTC TCCCCAGAGC TTGTCTGAGT GGCGCATGCA GAACATTGCC CGAGACTCTG  1200
AGAACAGCTC CGAGGAAGAG TTCTTTGATG CCCACGAAGG CTTCTCGGAC AGTGAGGAGG  1260
TCTTCCCCAA GGAGATGACC AAGTGGAACT CCAATGACTT CATTGATGCC TTTGCCTCCC  1320
CAGTGGAGGC AGAGGGAACG CCAGAGCCTG GAGCCGAGGC AGCTAAAGGC ATTGAGGATG  1380
GGGCCCAAGC ACCCAGGGAC TCAGAGGGCC TGGATGGAGC CGGGGAGCTG GGGGCTGAGG  1440
CATGCGCAGT CCACGCCCTC TTCCTTATCC TGCACAGCGG CAACATCCTG GACTCAGGCC  1500
CTGGAGACGC CAACTCCAAG CAGGCGGATG TGCAGACGCT GAGCTCCGCC TTCGAGGCCG  1560
TCACCCGCAT CCACTTCCCT GAGGCCTTGG GCCACGTGGC GCTGCGACTG GTGCCCTGTC  1620
CACCCATCTG CGCCGCCGCC TATGCCCTTG TCTCCAACCT GAGCCCTTAC AGCCACGATG  1680
GGGACAGCCT GTCTCGCTCC CAAGACCACA TTCCACTGGC TGCCCTGCCA CTGCTGGCCA  1740
CCTCATCCTC CCGCTACCAG GGCGCCGTGG CCACCGTCAT TGCCCGCACC AACCAGGCCT  1800
ACTCAGCCTT CCTGCGCTCA CCTGAGGGTG CCGGCTTCTG TGGGCAGGTC GCACTGATTG  1860
GAGATGGTGT TGGTGGCATC CTGGGCTTTG ATGCACTCTG CCACAGTGCT AACGCGGGCA  1920
CCGGGAGTCG GGGCAGCAGC CGCCGTGGGA GCATGAACAA TGAGCTGCTC TCTCCGGAGT  1980
TTGGCCCAGT GCGGGACCCC CTGGCAGATG GTGTGGAAGG CCTGGGTCGG GGCAGCCCAG  2040
AACCCTCGGC CTTGCCTCCC CAGCGCATCC CCAGCGACAT GGCCAGTCCT GAGCCCGAGG  2100
GCTCTCAGAA CAGCCTTCAG GCAGCCCCCG CAACCACCTC CTCCTGGGAG CCCCGGCGGG  2160
CAAGCACGGC CTTCTGCCCA CCCGCTGCCA GTTCCGAGGC ACCTGACGGC CCCAGCAGCA  2220
CTGCCCGCCT TGACTTCAAG GTCTCTGGCT TCTTCCTCTT CGGCTCCCCA CTGGGCCTGG  2280
TGCTGGCTCT GCGCAAAACT GTGATGCCCG CCCTGGAGGC AGCCCAGATG CGCCCAGCCT  2340
GTGAACAGAT CTACAACCTC TTCCACGCGG CCGACCCCTG CGCCTCACGC CTCGAGCCCC  2400
TGCTGGCCCC GAAGTTCCAG GCCATCGCCC CACTGACCGT GCCCCGCTAC CAGAAGTTCC  2460
CCCTGGGAGA TGGCTCATCC CTGCTGCTGG CCGACACTCT GCAGACGCAC TCCAGCCTCT  2520
TTCTGGAGGA GCTGGAGATG CTGGTGCCCT CAACACCCAC CTCTACTAGC GGTGCCTTCT  2580
GGAAGGGCAG TGAGTTGGCC ACTGACCCCC CGGCCCAGCC AGCCGCCCCC AGCACCACCA  2640
GTGAGGTGGT TAAGATCCTG GAGCGCTGGT GGGGGACCAA GCGGATCGAC TACTCGCTGT  2700
ACTGCCCCGA GGCGCTCACC GCCTTTCCCA CCGTCACGCT GCCCCACCTC TTCCACGCCA  2760
GCTACTGGGA GTCCGCCGAC GTGGTGGCGT TCATCCTGCG CCAGGTGATC GAGAAGGAGC  2820
GGCCACAGCT GGCGGAATGC GAGGAGCCGT CCATCTACAG CCCGGCCTTC CCCAGGGAGA  2880
AGTGGCAGCG AAAACGCACG CAGGTCAAGA TCCGGAACGT CACTTCCAAC CACCGGGCGA  2940
GCGACACGGT GGTGTGCGAG GGGCCGCCCC AGGTGCTAAG CGGGCGCTTC ATGTACGGGC  3000
CCCTGGACGT CGTCACGCTC ACTGGAGAGA AGGTGGATGT CTACATCATG ACGCAGCCGC  3060
TGTCGGGCAA GTGGATCCAC TTTGGCACCG AAGTCACCAA TAGCTCGGGC CGCCTCACCT  3120
TCCCAGTTCC CCCAGAACGC GCGCTGGGCA TTGGTGTCTA CCCCGTGCGC ATGGTGGTCA  3180
GGGGCGACCA CACCTATGCC GAATGCTGCC TGACTGTGGT GGCCCGCGGC ACGGAGGCTG  3240
TGGTCTTCAG CATCGACGGC TCCTTCACCG CCAGCGTCTC CATCATGGGC AGCGACCCCA  3300
AGGTGCGAGC TGGCGCCGTG GACGTGGTCA GGCACTGGCA GGACTCCGGC TACCTGATCG  3360
TGTATGTCAC AGGCCGGCCG GATATGCAGA AGCACCGCGT GGTGGCATGG CTGTCGCAGC  3420
ACAACTTCCC CCACGGCGTC GTCTCCTTCT GCGACGGCCT CACCCACGAC CCACTACGCC  3480
AGAAGGCAAT GTTTCTGCAG AGCCTGGTGC AGGAGGTAGA ACTGAACATC GTGGCCGGTT  3540
ATGGGTCTCC CAAAGATGTG GCTGTATACG CGGCGCTGGG GCTGTCCCCG AGCCAGACCT  3600
ACATCGTGGG CCGTGCCGTG CGGAAGCTAC AGGCGCAGTG CCAGTTCCTG TCAGACGGCT  3660
ATGTGGCCCA CCTGGGCCAG CTGGAAGCGG GCTCGCACTC GCATGCCTCC TCGGGACCCC  3720
CGAGAGCTGC CTTGGGCAAG AGCAGCTATG GTGTGGCTGC CCCCGTGGAC TTCCTGCGCA  3780
AACAGAGCCA GCTGCTTCGC TCGAGGGGCC CCAGCCAGGC GGAGCGTGAG GGCCCGGGAA  3840
CACCACCCAC CACCCTGGCA CGGGGCAAAG CACGGAGCAT CAGCCTGAAG CTGGACAGCG  3900
AGGAGTGAGG CCCACACCAG CCTGGACCTG GGTTATTTAT TGACACACCC AAGGGGCCCG  3960
AGGGGCTGCG TGTGGGGAGG CTGGGGACCC AGACTTTTGG CCCCAGCGCT GGCCCCCCCA  4020
GCCCCACACC CTATATCTCC GTGTGCTCCT CGGTGTTACT TCCCTTTCAT ATGAGGGGAC  4080
CCAGCGCCGG GGGGAGGGAG GAGGGCGTGG GCATGGGCGC AGAGGCTTTT CCAGTGTGTA  4140
TAAATCCATG AAAATAAACG CCACCTGCAC CCTAAAAAAA AAAAGTCGAC  4190



(2) INFORMATION FOR SEQ ID NO: 3:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 5020 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

CGGCCGCGT CGACAAGGAA CCTTGCCTAG AAGTCCCAAC TTGCAGTTCC CCATCGACGG  60
GAAGGCTTGG ACTCCAAGAT GATTATAAAG GAATATCGGA TTCCTCTGCC AATGACCGTG  120
GAGGAGTACC GCATCGCCCA GCTGTACATG ATACAGAAGA AGAGCCGTAA CGAGACATAT  180
GGCGAAGGCA GCGGCGTGGA GATCCTGGAG AACCGGCCGT ACACAGATGG CCCAGGCGGC  240
TCTGGGCAGT ACACACACAA GGTGTATCAT GTGGGCATGC ACATTCCCAG CTGGTTCCGC  300
TCCATCCTGC CCAAGGCAGC CCTGCGGGTG GTGGAGGAGT CTTGGAATGC CTACCCCTAC  360
ACCCGAACCA GGTTCACCTG TCCTTTCGTG GAGAAATTCT CCATCGACAT TGAAACCTTT  420
TATAAAACTG ATGCTGGAGA AAACCCCGAC GTGTTCAACC TCTCTCCTGT GGAAAAGAAC  480
CAGCTGACAA TCGACTTCAT CGACATTGTC AAAGACCCTG TGCCCCACAA CGAGTATAAG  540
ACAGAAGAGG ACCCCAAGCT GTTCCAGTCA ACCAAGACCC AGCGGGGGCC CCTGTCCGAG  600
AACTGGATCG AGGAGTACAA GAAGCAGGTC TTCCCCATCA TGTGCGCATA CAAGCTCTGC  660
AAGGTGGAGT TCCGCTACTG GGGCATGCAG TCCAAGATCG AGAGGTTCAT CCACGACACC  720
GGACTACGGA GGGTGATGGT GCGGGCTCAC CGGCAGGCCT GGTGCTGGCA GGACGAGTGG  780
TATGGGCTGA GCATGGAGAA CATCCGGGAG CTGGAGAAGG AGGCACAGCT CATGCTTTCC  840
CGTAAGATGG CCCAGTTCAA TGAGGATGGT GAGGAGGCCA CTGAGCTCGT CAAGCACGAA  900
GCCGTCTCGG ACCAGACCTC TGGGGAGCCC CCGGAGCCCA GCAGCAGCAA TGGGGAGCCC  960
CTAGTGGGGC GCGGCCTCAA GAAACAGTGG TCCACATCCT CCAAGTCGTC TCGGTCGTCC  1020
AAGCGGGGAG CGAGTCCTTC CCGCCACAGC ATCTCAGAGT GGAGGATGCA GAGTATTGCC  1080
AGGGACTCGG ATGAGAGCTC AGATGATGAG TTCTTCGATG CGCACGAGGA CCTGTCCGAC  1140
ACAGAGGAAA TGTTCCCCAA GGACATCACC AAGTGGAGCT CCAATGACCT CATGGACAAG  1200
ATCGAGAGCC CAGAGCCGGA AGACACACAA GATGGTCTGT ACCGCCAGGG TGCCCCTGAG  1260
TTCAGGGTGG CCTCCAGTGT GGAGCAGCTG AACATCATAG AGGACGAGGT TAGCCAGCCG  1320
CTGGCTGCAC CGCCCTCCAA GATCCACGTG CTGCTATTGG TGCTGCACGG AGGCACCATC  1380
CTGGACACAG GCGCCGGGGA CCCCAGCTCC AAGAAGGGCG ATGCTAACAC CATCGCCAAC  1440
GTGTTCGACA CCGTCATGCG CGTGCACTAC CCCAGCGCCC TGGGCCGCCT TGCCATCCGC  1500
CTGGTGCCCT GCCCGCCCGT CTGCTCTGAC GCCTTTGCCC TGGTCTCCAA CCTCAGCCCC  1560
TACAGCCATG ACGAAGGCTG TCTGTCCAGC AGTCAGGACC ACATTCCCCT GGCTGCCCTC  1620
CCCCTGCTGG CCACCTCCTC CCCCCAGTAC CAGGAGGCAG TTGCCACAGT GATTCAGCGA  1680
GCCAACCTTG CCTATGGGGA CTTCATCAAG TCCCAGGAGG GCATGACCTT CAATGGGCAG  1740
GTCTGCCTGA TTGGGGACTG CGTCGGGGGC ATCCTGGCAT TTGATGCCCT GTGCTACAGT  1800
AACCAGCCGG TGTCTGAGAG TCAGAGCAGC AGCCGCCGGG GCAGCGTGGT CAGCATGCAG  1860
GACAATGACC TGCTGTCCCC GGGCATCCTG ATGAATGCAG CACACTGCTG CGGTGGTGGC  1920
GGTGGCGGCG GTGGCGGTGG TGGCAGCAGT GGTGGTGGTG GCAGTAGTGG TGGCTCCAGC  1980
CTGGAGAGCA GTCGGCACCT GAGCCGAAGC AACGTCGACA TCCCCCGCAG CAACGGCACT  2040
GAGGACCCCA AAAGGCAACT GCCCCGCAAG AGGAGCGACT CATCCACCTA CGAGCTGGAT  2100
ACCATCCAGC AGCACCAGGC CTTCCTGTCC AGCCTCCATG CCAGCGTGCT GAGGACTGAG  2160
CCCTGCTCAC GCCATTCCAG CAGCTCCACC ATGCTGGATG GCACAGGTGC CCTGGGCAGG  2220
TTTGACTTTG AGATCACCGA CCTCTTCCTC TTCGGGTGCC CGCTGGGGCT GGTCCTGGCC  2280
TTGAGGAAGA CTGTCATCCC AGCCCTGGAT GTTTTCCAGC TGCGGCCGGC CTGCCAGCAA  2340
GTCTACAACC TCTTCCACCC CGCGGACCCG TCAGCTTCAC GCCTGGAGCC GCTGCTGGAA  2400
CGGCGCTTTC ACGCCCTGCC GCCTTTCAGC GTCCCCCGCT ACCAACGCTA CCCGCTGGGG  2460
GATGGCTGCT CCACGCTGCT GGCGGATGTG CTCCAGACCC ACAATGCAGC CTTCCAAGAG  2520
CATGGCGCCC CCTCCTCGCC GGGCACTGCC CCTGCCAGTC GTGGCTTCCG CCGAGCCAGT  2580
GAGATCAGCA TCGCCAGCCA GGTGTCAGGC ATGGCTGAGA GCTACACGGC ATCCAGCATC  2640
GCCCAGAAGG CCCCCGATGC GCTCAGCCAT ACCCCCAGCG TCAGGCGTCT GTCCCTGCTC  2700
GCCCTGCCCG CCCCCAGCCC CACCACCCCT GGCCCCCACC CTCCAGCCAG GAAGGCAAGC  2760
CCTGGCCTGG AGAGGGCCCC TGGCCTCCCT GAGCTGGACA TTGGAGAAGT CGCTGCAAAG  2820
TGGTGGGGCC AGAAGCGGAT CGACTACGCC CTGTACTGCC CTGACGCCCT CACGGCCTTC  2880
CCCACGGTGG CTCTGCCTCA CCTCTTCCAC GCCAGCTACT GGGAGTCAAC AGACGTGGTC  2940
TCCTTTCTGC TGAGACAGGT CATGAGGCAT GACAACTCCA GCATCTTGGA GCTGGATGGC  3000
AAGGAAGTGT CGGTGTTCAC CCCCTCAAAG CCAAGGGAGA AGTGGCAGCG CAAGCGGACC  3060
CACGTGAAGC TGCGGAACGT GACGGCCAAC CACCGGATCA ATGATGCCCT TGCCAATGAG  3120
GACGGCCCCC AGGTTCTGAC GGGCAGGTTC ATGTATGGGC CCCTGGACAT GGTCACCCTG  3180
ACTGGGGAGA AGGTGGATGT GCACATCATG ACCCAGCCGC CCTCAGGCGA GTGGCTCTAC  3240
CTGGATACGC TGGTGACCAA CAACAGTGGG CGTGTCTCCT ACACCATCCC TGAGTCGCAC  3300
CGCCTGGGCG TGGGTGTCTA CCCTATCAAG ATGGTGGTCA GGGGAGACCA CACGTTTGCC  3360
GACAGCTACA TCACCGTGCT GCCCAAGGGC ACAGAGTTCG TGGTCTTCAG CATCGACGGT  3420
TCCTTTGCCG CTAGCGTGTC CATCATGGGC AGCGACCCCA AGGTGCGGGC CGGGGCCGTG  3480
GACGTGGTGC GGCACTGGCA GGACCTGGGC TACCTCATCA TCTACGTGAC GGGCCGGCCC  3540
GACATGCAGA AGCAGCGGGT GGTGGCGTGG CTGGCCCAGC ACAACTTCCC CCATGGCGTG  3600
GTGTCCTTCT GTGACGGCCT GGTGCATGAC CCGCTGCGGC ACAAGGCCAA CTTCCTGAAG  3660
CTGCTCATCT CCGAGCTGCA CCTGCGCGTG CACGCGGCCT ATGGCTCCAC CAAGGACGTG  3720
GCGGTGTACA GCGCCATTAG CCTGTCCCCC ATGCAGATCT ACATCGTGGG CCGGCCCACC  3780
AAGAAGCTGC AGCAGCAGTG CCAGTTCATC ACGGATGGCT ACGCGGCCCA CCTGGCGCAG  3840
CTGAAGTACA GCCACCGGGC GCGGCCCGCT CGCAACACGG CCACCCGCAT GGCGCTGCGC  3900
AAGGGCAGCT TCGGCCTGCC CGGCCAGGGC GACTTTCTGC GCTCCCGGAA CCACCTGCTT  3960
CGCACCATCT CGGCCCAGCC CAGCGGGCCC AGCCACCGGC ACGAGCGGAC ACAGAGCCAG  4020
GCGGATGGCG AGCAGCGGGG CCAGCGCAGC ATGAGTGTGG CGGCCGGCTG CTGGGGCCGC  4080
GCCATGACTG GCCGCCTGGA GCCGGGGGCA GCCGCGGGCC CCAAGTAGGG CACCGTGAGT  4140
GCAGCGCGGG GTCTCCATGG TGCTAGGCCA GGGTGGCCAG CCCCGCCAGG AGGCCTGGCC  4200
TGGGCACACG CACTGACGTG GGCCTGGGAG ATTGTCCCAG GGCCTTGTGG AGGACACGGG  4260
CCGCACCACA CAGTGCTCCC TGCCCTGCCT CACGTCCTCG GGCCTGACGG GTCCGGCTTG  4320
TCATGGAAGC TGGCAGGGAC CACCAGCCCC AGGATGGCAG AGGGACCAGA ACCTCCCACT  4380
CAGACTGGCC CGGGAGGTTC TCCCAGACAT TTTGCCCTGT GTGGATCTCC AAGTGTCCTG  4440
GTGCCAGGTG TGGGCCCAGG CGCAGCCTGC CACCTCCCCA TCCACTGGCC ACCCTCACTC  4500
CCAGGTCCCC TCCCATTTGG TAGCAGCTCC AACAGGGGTC CAGCCTGCAT CTTGTTAACT  4560
CGAGTTTCTC AACTGTTCAA CCTCACTGGT TTTGCACTGA TTTTTGAGAG CGGAGACCCA  4620
TTACCACCTC CTATGGCTAC AGCCCCGTTG ACATGCATGA AACTCAGTAC CTGCTGACCC  4680
AGGACCTACA ACCACACTGA AGGCTCCAGT GCGGCAGAGC CTCGTGCAAG CAGGAGAGAA  4740
AGGCTGTATC TTAATTTCTG CACCCCGGAC CCTGCCCACC TGTCTGCCTG CCCCGCCTGG  4800
AGCCCAGGCC AGTGTTGTTT CCAGCCTCAG GCCACGGGCT GGACGGGCCT GGCCGCCTCT  4860
TCCGCTCCCT GCCATCAGTC AAGGCCGCCC GCCCACGTTT CTACGCCTTT CTACTTCTCA  4920
ATCTGATTTC TATGAGGTTT TTTTAAACGA GCAATCCTTG GCTGCTTCCT TTTCTTAACT  4980
CTTTCAGTAC TGAGAGCAGC CCCTCCGTCG ACGCGGCCGC  5020



(2) INFORMATION FOR SEQ ID NO: 4:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 1244 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

  (ii) MOLECULE TYPE: peptide

  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
MAKAGRAGGPPPGGGAPWHLRNVLSDSVESSDDEFFDAREEMAEGKNAILIGMSQWNS
NDLVEQIETMGKLDEHQGEGTAPCTSSILQEKQRELYRVSLRRQRFPA
QGSIEIHEDSEEGCPQRSCKTHVLLLVLHGGNILDTGAGDPSCKAADIHTFSSVLEKVTRAHFPAALG
HILIKFVPCPAICSEAFSLVSHLNPYSHDEGCLSSSQDHVPLAALPLLAISSPQYQDAVATVIERANQ
VYREFLKSSDGIGFSGQVCLIGDCVGGLLAFDAICYSAGPSGDSPASSSRKGSISSTQDTPVAVEEDC
SLASSKRLSKSNIDISSGLEDEEPKRPLPRKQSDSSTYDCEAITQHHAFLSSIHSSVLKDESETPAAG
GPQLPEVSLGRFDFDVSDFFLFGSPLGLVLAMRRTVLPGLDGFQVRPACSQVYSFFHCADPSASRLEP
LLEPKFHLVPPVSVPRYQRFPLGDGQSLLLADALHTHSPLFLEGSSRDSPPLLDAPASPPQASRFQRP
GRRMSEGSSHSESSESSDSMAPVGASRITAKWWGSKRIDYALYCPDVLTAFPTVALPHLFHASYWEST
DVVAFILRQVMRYESVNIKESARLDPAALSPANPREKWLRKRTQVKLRNVTANHRANDVIAAEDGP
LVGRFMYGPLDMVALTGEKVDILVMAEPSSGRWVHLDTEITNSSGRITYNVPRPRR
RGDQTCAMSYLTVLPRGMECVVFSIDGSFAASVSIMGSDPKVRPGAVDVVRHWQDLGYMILYITGRPD
MQKQRVVSWLSQHNFPQGMIFFSDGLVHDPLRQKAIFLRNLMQECFIKISAAYGSTKDISVYSVLGLP
ASQIFIVGRPTKKYQTQCQFLSEGYAAHLAVLEASHRSRPKKNNSRMILRKGSFGLHAQPEFLRKRNH
LRRTMSVQQPDPPAANPKPERAQSQPESDKDHERPLPALSWARGPPKFESVP

(2) INFORMATION FOR SEQ ID NO: 5:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 1244 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

  (ii) MOLECULE TYPE: peptide

  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

Met Leu Ile Lys Glu Tyr His Ile Leu Leu Pro Met Ser Leu Asp Glu
1               5                   10                  15

Tyr Gln Val Ala Gln Leu Tyr Met Ile Gln Lys Lys Ser Arg Glu Glu
            20                  25                  30

Ser Ser Gly Glu Gly Ser Gly Val Glu Ile Leu Ala Asn Arg Pro Tyr
        35                  40                  45

Thr Asp Gly Pro Gly Gly Ser Gly Gln Tyr Thr His Lys Val Tyr His
    50                  55                  60

Val Gly Ser His Ile Pro Gly Trp Phe Arg Ala Leu Leu Pro Lys Ala
65                  70                  75                  80

Ala Leu Gln Val Glu Glu Glu Ser Trp Asn Ala Tyr Pro Tyr Thr Arg
                85                  90                  95

Thr Arg Tyr Thr Cys Pro Phe Val Glu Lys Phe Ser Ile Glu Ile Glu
            100                 105                 110

Thr Tyr Tyr Leu Pro Asp Gly Gly Gln Gln Pro Asn Val Phe Asn Leu
        115                 120                 125

Ser Gly Ala Glu Arg Arg Gln Arg Ile Leu Asp Thr Ile Asp Ile Val
    130                 135                 140

Arg Asp Ala Val Ala Pro Gly Glu Tyr Lys Ala Glu Glu Asp Pro Arg
145                 150                 155                 160

Leu Tyr His Ser Val Lys Thr Gly Arg Gly Pro Leu Ser Asp Asp Trp
                165                 170                 175

Ala Arg Thr Ala Ala Gln Thr Gly Pro Leu Met Cys Ala Tyr Lys Leu
            180                 185                 190

Cys Lys Val Glu Phe Arg Tyr Trp Gly Met Gln Ala Lys Ile Glu Gln
        195                 200                 205

Phe Ile His Asp Val Gly Leu Arg Arg Val Met Leu Arg Ala His Arg
    210                 215                 220

Gln Ala Trp Cys Trp Gln Asp Glu Trp Thr Glu Leu Ser Met Ala Asp
225                 230                 235                 240

Ile Arg Ala Leu Glu Glu Glu Thr Ala Arg Met Leu Ala Gln Arg Met
                245                 250                 255

Ala Lys Cys Asn Thr Gly Ser Glu Gly Ser Glu Ala Gln Pro Pro Gly
            260                 265                 270

Lys Pro Ser Thr Glu Ala Arg Ser Ala Ala Ser Asn Thr Gly Thr Pro
        275                 280                 285

Asp Gly Pro Glu Ala Pro Pro Gly Pro Asp Ala Ser Pro Asp Ala Ser
    290                 295                 300

Phe Gly Lys Gln Trp Ser Ser Ser Ser Arg Ser Ser Tyr Ser Ser Gln
305                 310                 315                 320

His Gly Gly Ala Val Ser Pro Gln Ser Leu Ser Glu Trp Arg Met Gln
                325                 330                 335

Asn Ile Ala Arg Asp Ser Glu Asn Ser Ser Glu Glu Glu Phe Phe Asp
            340                 345                 350

Ala His Glu Gly Phe Ser Asp Ser Glu Glu Val Phe Pro Lys Glu Met
        355                 360                 365

Thr Lys Trp Asn Ser Asn Asp Phe Ile Asp Ala Phe Ala Ser Pro Val
    370                 375                 380

Glu Ala Glu Gly Thr Pro Glu Pro Gly Ala Glu Ala Ala Lys Gly Ile
385                 390                 395                 400

Glu Asp Gly Ala Gln Ala Pro Arg Asp Ser Glu Gly Leu Asp Gly Ala
                405                 410                 415

Gly Glu Leu Gly Ala Glu Ala Cys Ala Val His Ala Leu Phe Leu Ile
            420                 425                 430

Leu His Ser Gly Asn Ile Leu Asp Ser Gly Pro Gly Asp Ala Asn Ser
        435                 440                 445

Lys Gln Ala Asp Val Gln Thr Leu Ser Ser Ala Phe Glu Ala Val Thr
    450                 455                 460

Arg Ile His Phe Pro Glu Ala Leu Gly His Val Ala Leu Arg Leu Val
465                 470                 475                 480

Pro Cys Pro Pro Ile Cys Ala Ala Ala Tyr Ala Leu Val Ser Asn Leu
                485                 490                 495

Ser Pro Tyr Ser His Asp Gly Asp Ser Leu Ser Arg Ser Gln Asp His
            500                 505                 510

Ile Pro Leu Ala Ala Leu Pro Leu Leu Ala Thr Ser Ser Ser Arg Tyr
        515                 520                 525

Gln Gly Ala Val Ala Thr Val Ile Ala Arg Thr Asn Gln Ala Tyr Ser
    530                 535                 540

Ala Phe Leu Arg Ser Pro Glu Gly Ala Gly Phe Cys Gly Gln Val Ala
545                 550                 555                 560

Leu Ile Gly Asp Gly Val Gly Gly Ile Leu Gly Phe Asp Ala Leu Cys
                565                 570                 575

His Ser Ala Asn Ala Gly Thr Gly Ser Arg Gly Ser Ser Arg Arg Gly
            580                 585                 590

Ser Met Asn Asn Glu Leu Leu Ser Pro Glu Phe Gly Pro Val Arg Asp
        595                 600                 605

Pro Leu Ala Asp Gly Val Glu Gly Leu Gly Arg Gly Ser Pro Glu Pro
    610                 615                 620

Ser Ala Leu Pro Pro Gln Arg Ile Pro Ser Asp Met Ala Ser Pro Glu
625                 630                 635                 640

Pro Glu Gly Ser Gln Asn Ser Leu Gln Ala Ala Pro Ala Thr Thr Ser
                645                 650                 655

Ser Trp Glu Pro Arg Arg Ala Ser Thr Ala Phe Cys Pro Pro Ala Ala
            660                 665                 670

Ser Ser Glu Ala Pro Asp Gly Pro Ser Ser Thr Ala Arg Leu Asp Phe
        675                 680                 685

Lys Val Ser Gly Phe Phe Leu Phe Gly Ser Pro Leu Gly Leu Val Leu
    690                 695                 700

Ala Leu Arg Lys Thr Val Met Pro Ala Leu Glu Ala Ala Gln Met Arg
705                 710                 715                 720

Pro Ala Cys Glu Gln Ile Tyr Asn Leu Phe His Ala Ala Asp Pro Cys
                725                 730                 735

Ala Ser Arg Leu Glu Pro Leu Leu Ala Pro Lys Phe Gln Ala Ile Ala
            740                 745                 750

Pro Leu Thr Val Pro Arg Tyr Gln Lys Phe Pro Leu Gly Asp Gly Ser
        755                 760                 765

Ser Leu Leu Leu Ala Asp Thr Leu Gln Thr His Ser Ser Leu Phe Leu
    770                 775                 780

Glu Glu Leu Glu Met Leu Val Pro Ser Thr Pro Thr Ser Thr Ser Gly
785                 790                 795                 800

Ala Phe Trp Lys Gly Ser Glu Leu Ala Thr Asp Pro Pro Ala Gln Pro
                805                 810                 815

Ala Ala Pro Ser Thr Thr Ser Glu Val Val Lys Ile Leu Glu Arg Trp
            820                 825                 830

Trp Gly Thr Lys Arg Ile Asp Tyr Ser Leu Tyr Cys Pro Glu Ala Leu
        835                 840                 845

Thr Ala Phe Pro Thr Val Thr Leu Pro His Leu Phe His Ala Ser Tyr
    850                 855                 860

Trp Glu Ser Ala Asp Val Val Ala Phe Ile Leu Arg Gln Val Ile Glu
865                 870                 875                 880

Lys Glu Arg Pro Gln Leu Ala Glu Cys Glu Glu Pro Ser Ile Tyr Ser
                885                 890                 895

Pro Ala Phe Pro Arg Glu Lys Trp Gln Arg Lys Arg Thr Gln Val Lys
            900                 905                 910

Ile Arg Asn Val Thr Ser Asn His Arg Ala Ser Asp Thr Val Val Cys
        915                 920                 925

Glu Gly Pro Pro Gln Val Leu Ser Gly Arg Phe Met Tyr Gly Pro Leu
    930                 935                 940

Asp Val Val Thr Leu Thr Gly Glu Lys Val Asp Val Tyr Ile Met Thr
945                 950                 955                 960

Gln Pro Leu Ser Gly Lys Trp Ile His Phe Gly Thr Glu Val Thr Asn
                965                 970                 975

Ser Ser Gly Arg Leu Thr Phe Pro Val Pro Pro Glu Arg Ala Leu Gly
            980                 985                 990

Ile Gly Val Tyr Pro Val Arg Met Val Val Arg Gly Asp His Thr Tyr
        995                 1000                1005

Ala Glu Cys Cys Leu Thr Val Val Ala Arg Gly Thr Glu Ala Val Val
    1010                1015                1020

Phe Ser Ile Asp Gly Ser Phe Thr Ala Ser Val Ser Ile Met Gly Ser
1025                1030                1035                1040

Asp Pro Lys Val Arg Ala Gly Ala Val Asp Val Val Arg His Trp Gln
                1045                1050                1055

Asp Ser Gly Tyr Leu Ile Val Tyr Val Thr Gly Arg Pro Asp Met Gln
            1060                1065                1070

Lys His Arg Val Val Ala Trp Leu Ser Gln His Asn Phe Pro His Gly
        1075                1080                1085

Val Val Ser Phe Cys Asp Gly Leu Thr His Asp Pro Leu Arg Gln Lys
    1090                1095                1100

Ala Met Phe Leu Gln Ser Leu Val Gln Glu Val Glu Leu Asn Ile Val
1105                1110                1115                1120

Ala Gly Tyr Gly Ser Pro Lys Asp Val Ala Val Tyr Ala Ala Leu Gly
                1125                1130                1135

Leu Ser Pro Ser Gln Thr Tyr Ile Val Gly Arg Ala Val Arg Lys Leu
            1140                1145                1150

Gln Ala Gln Cys Gln Phe Leu Ser Asp Gly Tyr Val Ala His Leu Gly
        1155                1160                1165

Gln Leu Glu Ala Gly Ser His Ser His Ala Ser Ser Gly Pro Pro Arg
    1170                1175                1180

Ala Ala Leu Gly Lys Ser Ser Tyr Gly Val Ala Ala Pro Val Asp Phe
1185                1190                1195                1200

Leu Arg Lys Gln Ser Gln Leu Leu Arg Ser Arg Gly Pro Ser Gln Ala
                1205                1210                1215

Glu Arg Glu Gly Pro Gly Thr Pro Pro Thr Thr Leu Ala Arg Gly Lys
            1220                1225                1230

Ala Arg Ser Ile Ser Leu Lys Leu Asp Ser Glu Glu
        1235                1240

(2) INFORMATION FOR SEQ ID NO: 6:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

  (ii) MOLECULE TYPE: peptide

  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
MIIKEYRIPLPMTVEEYRIAQLYMIQKKSRNETYGEGSGVEILENRPYTDGPGGSGQYTHK^A^GMHIPSWFRSILPKA 
1 0 ALRWEESWNAYPYTRTRFTCPFVEKFSIDIETFYKTDAGENPDVFNLSPVEKNQLTIDFIDIVKDPVPHNEYKTEEDPK 
LFQSTKTQRGPLSENWIEEYKKQVFPIMCAYKLCKVEFRY 

NIRELEKEAQLMLSRKMAQFNEDGEEATELVKHEAVSDQTSGEPPEPSSSNGEPLVGRGLKKQWSTSSKSSRSSKRGASP 
SRHSISEWRMQSIARDSDESSDDEFFDAHEDLSDTEEMFPKDITKWSSNDLMDKIESPEPEDTQDGLYRQGAPEFRVASS 
VEQLNIIEDEVSQPLAAPPSKIHVLLLVLHGGTILDTGAGDPSSKKGDAJ^TIANVFDTVMRVHYPSALGRLAIRLVPCPP 

15 VCSDAFALVSNLSPYSHDEGCLSSSQDHIPLAALPLLATSSPQYQEAVATVIQRANLAYGDFIKSQEGMTFNGQVCLIGD 
CVGGrLAFDALCYSNQPVSESQSSSRRGSWSMQDNDLLSPGILMNAAHCCGGGGGGGGGGGSSGGGGSSGGSSLESSRH 
LSRSNWIPRSNGTEDPKRQLPRKRSDSSTYELDTIQQHQAFLSSLHASVLRTEPCSRHSSSSTMLDGTGALGRFDFEIT 
DLFLFGCPLGLVToALRKTVIPALDVFQLRPACQQVYNLFHPADPSASRLEPLLERRFHALPPFSVPRYQRYPLGDGCSTL 
LADVLQTHNAAFQEHGAPSSPGTAPASRGFRRASEISIASQVSGMAESYTASSIAQKAPDAXjSHTPSVRRLSLLALPAPS 

2 0 PTTPGPHPPARKASPGLERAPGLPELDIGEVAAK^GQKRIDYALYCPDALTAFPWALPHLFHASYV/ESTDWSFLLRQ 
VMRHDNSSILELDGKEVSVFTPSKPREKWQRKRTHVKLRNVTANHRI^^ 
WIMTQPPSGEWLYLDTLVTNNSGRVSYTIPESHRLGVGVYPIK^^ 
SIMGSDPKVRAGAVDVWHWQDLGYLIIYVTGRPDMQKQRWAW^ 

HLRVHAAYGSTKDVAVYSAISLSPMQIYIVGRPTKKLQQQCQ FGL 
PGQGDFLRSRNHLLRTI SAQPSGPSHRHERTQSQADGEQRGQRSMSVAAGCWGRAMTGRLEPGAAAGPK



(2) INFORMATION FOR SEQ ID NO: 7:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 986 amino acids

(B) TYPE: amino acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Met Leu Ile Lys Glu Tyr Arg Ile Leu Leu Pro Met Thr Val Gln Glu
1 5 10 15

Tyr Arg Ile Ala Gln Leu Tyr Met Ile Gln Lys Lys Ser Arg Leu Asp
20 25 30

Ser His Gly Gln Asp Ser Gly Val Glu Ile Ile Ser Asn Lys Pro Tyr
35 40 45

Thr Asp Gly Pro Gly Gly Ser Gly Gln Tyr Thr Phe Lys Ile Tyr His
50 55 60

Ile Gly Ser Arg Ile Pro Ala Trp Ile Arg Thr Val Leu Pro Thr Asn
65 70 75 80

Ala Leu Glu Ala His Glu Glu Ser Trp Asn Ala Tyr Pro Val Thr Lys
85 90 95

Thr Arg Tyr Ser Thr Pro Met Met Asp Arg Phe Ser Leu Glu Val Glu
100 105 110

Thr Leu Tyr Phe Asp Asp His Gly Gln Gln Glu Asn Val Phe Asn Leu
115 120 125

Asn Glu Lys Asp Lys Ser Thr Arg Ile Ile Asp Tyr Met Asp Phe Val
130 135 140

Lys Asp Pro Ile Ser Ser His Asp Tyr Cys Ala Glu Glu Asp Pro Lys
145 150 155 160

Leu Tyr Arg Ser Glu Thr Thr Asn Arg Gly Pro Leu Asn Asp Asp Trp
165 170 175

Val Ala Glu His Leu Lys Lys Gly Leu Pro Ile Met Cys Ala Tyr Lys
180 185 190

Leu Cys Lys Val Glu Phe Arg Tyr Trp Gly Met Gln Thr Arg Ala Glu
195 200 205

Arg Trp Ile His Asp Leu Ala Leu Arg Asn Thr Met Met Arg Ala His
210 215 220

Arg Gln Ala Trp Ala Trp Gln Asp Glu Trp Thr Gly Leu Thr Met Asn
225 230 235 240

Asp Ile Arg Lys Leu Glu Ala Glu Ala Ala Leu His Leu Ser Lys Val
245 250 255

Met Ser Val Lys Glu Asn Glu Asp Gly His Gln Asp Glu Asn Asp Thr
260 265 270

Asp Asp Asp Met Asp Ala Gly Asp Ala Val Ser Asp Asp Leu Tyr Phe
275 280 285

Asp Cys Thr Asp Thr Ser Pro Ile Pro Thr Gln Lys Pro Ser Ile Ile
290 295 300

Arg Trp Ser Ser Glu Leu Glu Leu Glu Ile Gln Asp Asp Asn Ser Pro
305 310 315 320

Pro Leu Thr Pro His Asn Gly Ser Thr Glu Val Ala Leu Leu Ile Met
325 330 335

Val Phe His Gly Asp Phe Ser Pro Asp Asn Pro Ala Asp Ser Lys Thr
340 345 350

Thr Asp Thr Asn Thr Phe Ser Ser Thr Ile Glu Thr Cys Val Gln Arg
355 360 365

His Tyr Pro Gln Leu Arg Asn Arg Leu His Ile Val Asn Val Ser Cys
370 375 380

Gly His Glu Met Thr Gln Val Val Ser Lys Leu Ser Asn Ile Ser Pro
385 390 395 400

Ser Phe Gly Leu Leu His Pro Ser Leu Ser Leu Met Leu Pro Ser Ala
405 410 415

Ser His Leu Tyr Asn Glu Ala Val Glu Gly Thr Ile Arg Arg Ala Asn
420 425 430

Glu Thr Tyr Asn Glu Phe Ile Ala Ser Gln Pro Leu Phe Asn Gly Glu
435 440 445

Val Phe Val Val Gly Asp Cys Val Gly Gly Ile Phe Leu Tyr Glu Ala
450 455 460

Met Thr Arg Lys Cys Asp Ser Met Thr Leu Leu Lys Arg Leu Ser Ser
465 470 475 480

Asn Leu Ser Ser Arg Ile Ile Lys Glu Asp Gln Ser Pro His Gln Ser
485 490 495

Met Thr Asp Ile Thr Ile Thr Asp Thr Ser Ser Ile Ser Ser Cys Pro
500 505 510

Gln Gln His Asn Gln Ser Val Arg Asp His Ser Ser Leu Gln Asn Gly
515 520 525

His Ala Ser Arg Arg Ser Ala Arg Asn Tyr Ser Ala Pro Pro Ser Ala
530 535 540

Ser Tyr Val Gln Ile Asp Gly Leu Asp Ser Cys Gln Leu Phe Asn Leu
545 550 555 560

Tyr Tyr Pro Leu Asp Pro Cys Gly Ala Arg Ile Glu Pro Val Leu Asp
565 570 575

Gly Gln Leu Ser Cys Val Pro Pro Tyr Asn Val Pro Lys Tyr Pro Leu
580 585 590

Gly Asp Gly Lys Ser Gln Lys Phe Glu Ser Thr Ile Asp Ala Thr Gln
595 600 605

Met Trp Gly Ser Lys Arg Ile Asp Asn Leu Leu Tyr Cys Pro Asn Ser
610 615 620

Met Val Val Ala Leu Pro Ser Ser Ala Leu Pro Thr Ile Leu His Ala
625 630 635 640

Ser Tyr Trp Glu Ser Cys Asp Asn Val Ala Ser Thr Leu Arg Gln Phe
645 650 655

Val Arg Gly Glu Glu Asn Ser Phe Leu Val Leu Leu Ser Ser Ser Met
660 665 670

Asn Asn Ile Pro Leu Asn Ile Asp Leu Pro Thr Met His Trp Lys Arg
675 680 685

Lys Arg Thr Arg Phe Lys Ile Ala Asn Leu Ser Ala Asn His Arg Ala
690 695 700

Asn Asp Ile Leu Val Thr Ala Gly Met Asp Leu Asp Val Ile Ala Lys
705 710 715 720

Phe Cys Tyr Gly Pro Met Asp Thr Leu Val Ala Trp Arg Glu Pro Val
725 730 735

Ser Val Phe Val Tyr Pro Gln Leu Ser Arg Gly Tyr Leu His Gly Val
740 745 750

Phe Asp Thr Asp Ser His Gly Arg Leu Thr Leu Gln Leu Ala Lys Thr
755 760 765

Leu Pro Cys Gly Ile His Ser Val Lys Ile Val Val His Gly Asp Arg
770 775 780

Ser Tyr Leu Asp Ala Phe Val Ala Ile Val Pro Ala Gly Thr Lys Cys
785 790 795 800

Ala Val Phe Ser Val Asp Gly His Ser Leu Thr Val Val Ser Val Thr
805 810 815

Gly Lys Asp Pro Arg Val Arg Ala Ser Pro Gly Asp Val Val Arg Tyr
820 825 830

Trp Gln Glu Gln Gly Tyr Leu Ile Ile Tyr Leu Thr Ala Arg Pro Asp
835 840 845

Met Gln Gln Arg Val Val Ser Ala Trp Leu Ala Gln His Asn Phe Pro
850 855 860

His Ala Leu Leu Phe Phe Asn Asn Ser Phe Ser Val Glu Pro Leu Lys
865 870 875 880

Gln Lys Ser Leu His Leu Arg Thr His Ile Val Asn Gly Val His Ile
885 890 895

His Val Ala Tyr Gly Ser Gly Asp Met Lys Asp Val Tyr Thr Ser Ala
900 905 910

Gly Val Asp Pro Glu His Val Ile Ser Val Ala Gly Ser Arg Arg Arg
915 920 925

Asn Cys Val Gln Ile Glu Ser Tyr Ser Ser His Leu Ala Ala Leu Asn
930 935 940

Ser Gly Gln Cys Thr Leu Gly Lys Arg Ile Glu Pro Asp Gly Leu Thr
945 950 955 960

Leu Gln Leu His Arg Asn Val Asp Gln Arg Thr Ser Phe Thr Pro Arg
965 970 975

Gly Gly Lys Phe Glu Asn Glu Lys Asp Arg
980 985


(2) INFORMATION FOR SEQ ID NO: 8:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4308 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

GCGGCCGCCA CAAACAAACA AACACACGGA CACACATCTG GACCTGTACA CCTACGGCCC 60
CGGAAAATTA TCCATAGAAC AACCGCTGAC TGACCCCGCC TCGTTTTTTC CAATTCCATC 120
ATTCCGACCA GGTCATAGAC GACGTGCCGC CACCCCACGC CAATCACCCC CCTCGCCACA 180
AAAAACGAAA AAAAAAACCG TCGGACGACA GCCACGTCGC GCCTTCACAT CATCCAGCCA 240
TGACCAGCGG CGGCAATCGA TGATTGCCAT TCCCTCAGCC AACGAGAGCC AATAGAGGCA 300
GCCGGAAAGG AGGACGCCGG AATAGTCAGT CGGTATCGTC GGAAGAGTGC GCCATTCGCA 360
GAACGTCAAT AGCCGGAGGG GAGTCCGCCA TTTCAACGAC AAGGACCCAA GTCACGCGGT 420
GTCAACATGC TGATCAAGGA GTACCGCATT CCGCTGCCCC TCACCGTCGA GGAGTACCGC 480
ATCGCCCAGC TCTACATGAT TGCGAAAAAG AGTCGCGAGG AGAGCCATGG CGAGGGCAGT 540
GGCGTTGAGA TAATCATCAA TGAGCCGTAC AAGGATGGAC CCGGCGGTAA TGGTCAATAC 600
ACAAAGAAGA TCTATCACGT GGGCAATCAT CTGCCTGGCT GGATTAAAAG TCTCTTGCCG 660
AAAAGCGCTT TAACCGTGGA GGAGGAGGCC ATGGAATGCT ATCCGTATAC CAGGACTCGC 720
TACACCTGTC CGTTTGTGGA GAAATTCTCG CTGGATATTG AGACATACTA TTATCCGGAC 780
AATGGCTATC AGGACAATGT CTTCCAGCTG TCCGGAAGCG ATTTGCGTAA TCGGATCGTA 840
GACGTAATTG ACATTGTCAA GGATCAGCTG TGGGGCGGTG ACTATGTGAA GGAGGAGGAT 900
CCCAAGCACT TTGTGTCGGA CAAGACGGGC CGTGGACCCT TGGCCGAGGA TTGGCTGGAG 960
GAGTATTGGC GCGAAGTGAA GGGCAAAAAG CAACCGACAC CGCGCAACAT GTCCCTGATG 1020
ACCGCCTACA AGATCTGCCG CGTGGAGTTT CGCTACTGGG GCATGGAGAC AAAGCTGGAG 1080
AAGTTCATCC ACGATGTGGC GCTGCGCAAG ATGATGCTGC GGGCCCATCG GCAGGCGTGG 1140
GCATGGCAGG ACGAGTGGTT CGGCTTGACC ATCGAGGATA TACGCGAGCT GGAGCGACAG 1200
ACGCAACTGG CCCTGGCCAA GAAAATGGGC GGCGGCGAGG AGTGCAGCGA CGACAGCGTC 1260
TCGGAGCCGT ATGTCAGCAC GGCGGCCACC GCCGCATCCA CAACGGGCAG CGAGCGAAAG 1320
AAGTCCGCTC CGGCTGTGCC GCCTATTGTC ACCCAGCAGC CGCCGAGCGC CGAGGCCAGT 1380
TCGGATGAGG AGGGCGAGGA GGAGGAGGAT GACGACGAGG ACGAGAACGA TGCCATTGGC 1440
ACGGGCGTGG ATCTGTCAGC CAACCAAGGC GGATCCGCGC AGCGCTCGCG CTCCCAAAGC 1500
ATTCAAATGG CCCAGAAGGG CAAGTTCGGT TCAAAGGGTG CCCTTCACTC GCCGGTGGGA 1560
TCTGCCCATA GCTTCGATCT CCAGGTGGCT AACTGGCGTA TGGAGCGATT GGAAGTGGAC 1620
TCCAAATCCA ATTCGGATGA GGAATTCTTT GATTGCCTGG ACACCAATGA GACGAACTCG 1680
CTGGCCAAGT GGAGCTCGCT GGAGCTGCTT GGCGAGGGCG ACGACAGTCC GCCGCCACAT 1740
GGCGGACCCT CTAGTGCAGC ATCGGTGGGT GGGCGTGGCA ACTCGCGGCA AGAGGACAGC 1800
ATATTCAATC AGGACTTTCT GATGCGCGTG GCCTCGGAGC GCGGCAACAA GCGGCAGTTA 1860
CGTTCCTCGG CCAGCGTGGA TCGCAGTCAC GATTCATCGC CGCCGGGATC GCCGAGTACA 1920
CCGTCGTGTC CCACAACCAT TCTGATCCTG GTTGTCCATG CGGGCAGCGT TTTGGATGCG 1980
GCCAGCGAGC TGACCGCCAA GAAATCCGAT GTGACCACAT TCCGTGGCTC CTTCGAGGCG 2040
GTTATGCGAC ACGACTATCC CAGCCTCCTC ACCCATGTGA CCATCAAGAT GGTGCCGTGC 2100
CCCTCAATAT GCACCGACGC CCTGGGCATT CTCTCCAGCC TGAGTCCGTA CTCCTTTGAT 2160
GCGTCGCCCT CGGCGGCGGA TATACCGAAT ATAGCCGATG TCCCCATTGG AGCTATACCA 2220
CTACTATCTG TGGCATCGCC AGAATTCCAC GAGACGGTCA ACAAGACGGT TGCCGCTGCC 2280
AATATTGTCT GCCATGAGTT TTTGAAATCG GAGGAGGGTC ACGGATTCTC TGGCGAGATT 2340
GTCATGCTGG GCGATTCGAT GGGTTCGCTG CTGGCGTACG AGGCCCTCTG CCGATCGAAT 2400
GGCAGCCAGC CGGGCACGGC TTCGGGTGCC TCGAATTCCG GCGGAGATGC GGCCACAAAT 2460
ATAAATACCC ACAATCCGTT GAGCCCACGT AATTCGCGAT TGGACGATGA CGAGCGTTTC 2520
ATCGAAGCCG ATCTGGATGC CAAGCGTTTG CTAGTGGCCC CATCGCCACG TAGACGCCGT 2580
TCCAGCTCAT CCAGCGATTC GCGTGCCACC AAATTGGACT TTGAGGTCTG TGACTTCTTC 2640
ATGTTCGGAT CGCCGCTATC TGTGGTGCTG GCTGCAAGGA AACTTCACGA TGCCAAGGCC 2700
GCCCTGCCGC GGCCCAACTG CCACCAGGTC TACAATCTGT TCCATCCAAC CGATCCGATC 2760
GCCTCGCGCC TGGAGCCGCT TCTGAGCGCC CGGTTTTCTA TATTGGCGCC AGTCAATGTC 2820
CCACGGTACG CCAAGTATCC GCTGGGTAAT GGACAGCCAT TGCATTTATT GGAGGTCATT 2880
CAATCGCATC CGCAGCGCTT TAACGATGGC AATAACCTAT TGGCTGGTCG CCGTTTGTCG 2940
GACGCATCCA TGCAGAGCAC GATATCGGGT CTGATTGAGA ATGTCTCGCT TAGTACGATC 3000
CATGCCCTGC AAAACAAATG GTGGGGCACA AAGCGCTTGG ATTACGCATT ATATTGCCCG 3060
GAGGGATTGA GTAATTTCCC TGCTCACGCC TTGCCGCACC TCTTCCATGC CAGCTACTGG 3120
GAGAGTCCGG ATGTGATTGC CTTTATTCTA CGGCAGATTG GCAAATTCGA GGGCATACCC 3180
TTTGTGGGCT CAAACGATGA CAAGGACAAT GCCTCCTTCC ATCCCGGACA GCCGAGGGAG 3240
AAGTGGATTA AGAAACGGAC CTCGGTTAAG CTGAAAAATG TAGCCGCCAA TCATCGGGCC 3300
AACGATGTAA TCGTGCAGGA GGGCAGGGAG CAGCGATTGA ATGCGAGATT TATGTACGGA 3360
CCCCTGGACA TGATCACGCT GCACGGTGAA AAGGTGGATG TGCACATTAT GAAGGATCCG 3420
CCGGCGGGGC AGTGGACATT CCTCAGCACC GAGGTGACGG ACAAGAATGG TCGCATCTCG 3480
TACAGCATTC CGGATCAGGT ATCCCTTGGC TATGGTATAT ATCCGGTTAA GATGGTGGTC 3540
CGTGGCGATC ACACCTCGGT GGATTGCTAT ATGGCGGTGG TGCCGCGTTA ACCGAATGCG 3600
TGGTCTTCAG CATTGATGGC TCATTCACCG CTTCGATGTC GGTGACAGGT AGGGATCCCA 3660
AGGTGCGTGC CGGAGCTGTC GATGTTTGCC GCCACTGGCA GGAGCTGGGC TACCTGCTCA 3720
TTTACATCAC CGGACGACCG GATATGCAGC AGCAACGCGT GGTGTCCTGG CTGAGCCAGC 3780
ACAACTTCCC GCACGGCCTG ATCTCGTTCG CCGACGGCCT GTCCACCGAT CCATTGGGCC 3840
ACAAGACGGC CTATCTCAAC AATTTGGTTC AGAACCATGG AATCTCAATT ACTGCCCGTA 3900
CGGCAGCAGC AAGGACATTA GTGTCTACAC GAATGTTGGC ATGCGAACCG ATCAAATTTT 3960
CATCGTGGGC AAGGTTGGCA AGAAGCTGCA GTCGAATGCC ACCGTGCTTA GCGATGGCTA 4020
TGCCGCCCAC TTGGCCGGTT TGCAGGCTGT GGGTGGTTCG CGTCCGGCGA AGGGCAATGC 4080
CCGCATGGTC ATTCCACGCG GATGCTTCAA TCTTCCCGGC CAGACCGCAA ATCCGCGGCG 4140
CAGAAGGCTG CATGAACAAG CAACGAATGA AAATTGAATT GCAACTCAAG CAAACCAATT 4200
GTTTAGAGCA ATGAAAAACA ACAATTAAAG CGCTTGTAAA CAGATAGAAG ACGTTAAAAC 4260
CAAAAACAAA ACATTACAGA CAATTGATGT TAGAATTAGT GTTCTAGA 4308



(2) INFORMATION FOR SEQ ID NO: 9:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1250 amino acids

(B) TYPE: amino acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

Met Leu Ile Lys Glu Tyr Arg Ile Pro Leu Pro Leu Thr Val Glu Glu
1 5 10 15

Tyr Arg Ile Ala Gln Leu Tyr Met Ile Ala Lys Lys Ser Arg Glu Glu
20 25 30

Ser His Gly Glu Gly Ser Gly Val Glu Ile Ile Ile Asn Glu Pro Tyr
35 40 45

Lys Asp Gly Pro Gly Gly Asn Gly Gln Tyr Thr Lys Lys Ile Tyr His
50 55 60

Val Gly Asn His Leu Pro Gly Trp Ile Lys Ser Leu Leu Pro Lys Ser
65 70 75 80

Ala Leu Thr Val Glu Glu Glu Ala Met Glu Cys Tyr Pro Tyr Thr Arg
85 90 95

Thr Arg Tyr Thr Cys Pro Phe Val Glu Lys Phe Ser Leu Asp Ile Glu
100 105 110

Thr Tyr Tyr Tyr Pro Asp Asn Gly Tyr Gln Asp Asn Val Phe Gln Leu
115 120 125

Ser Gly Ser Asp Leu Arg Asn Arg Ile Val Asp Val Ile Asp Ile Val
130 135 140

Lys Asp Gln Leu Trp Gly Gly Asp Tyr Val Lys Glu Glu Asp Pro Lys
145 150 155 160

His Phe Val Ser Asp Lys Thr Gly Arg Gly Pro Leu Ala Glu Asp Trp
165 170 175

Leu Glu Glu Tyr Trp Arg Glu Val Lys Gly Lys Lys Gln Pro Thr Pro
180 185 190

Arg Asn Met Ser Leu Met Thr Ala Tyr Lys Ile Cys Arg Val Glu Phe
195 200 205

Arg Tyr Trp Gly Met Gln Thr Lys Leu Glu Lys Phe Ile His Asp Val
210 215 220

Ala Leu Arg Lys Met Met Leu Arg Ala His Arg Gln Ala Trp Ala Trp
225 230 235 240

Gln Asp Glu Trp Phe Gly Leu Thr Ile Glu Asp Ile Arg Glu Leu Glu
245 250 255

Arg Gln Thr Gln Leu Ala Leu Ala Lys Lys Met Gly Gly Gly Glu Glu
260 265 270

Cys Ser Asp Asp Ser Val Ser Glu Pro Tyr Val Ser Thr Ala Ala Thr
275 280 285

Ala Ala Ser Thr Thr Gly Ser Glu Arg Lys Lys Ser Ala Pro Ala Val
290 295 300

Pro Pro Ile Val Thr Gln Gln Pro Pro Ser Ala Glu Ala Ser Ser Asp
305 310 315 320

Glu Glu Gly Glu Glu Glu Glu Asp Asp Asp Glu Asp Glu Asn Asp Ala
325 330 335

Ile Gly Thr Gly Val Asp Leu Ser Ala Asn Gln Gly Gly Ser Ala Gln
340 345 350

Arg Ser Arg Ser Gln Ser Ile Gln Met Ala Gln Lys Gly Lys Phe Gly
355 360 365

Ser Lys Gly Ala Leu His Ser Pro Val Gly Ser Ala His Ser Phe Asp
370 375 380

Leu Gln Val Ala Asn Trp Arg Met Glu Arg Leu Glu Val Asp Ser Lys
385 390 395 400

Ser Asn Ser Asp Glu Glu Phe Phe Asp Cys Leu Asp Thr Asn Glu Thr
405 410 415

Asn Ser Leu Ala Lys Trp Ser Ser Leu Glu Leu Leu Gly Glu Gly Asp
420 425 430

Asp Ser Pro Pro Pro His Gly Gly Pro Ser Ser Ala Ala Ser Val Gly
435 440 445

Gly Arg Gly Asn Ser Arg Gln Glu Asp Ser Ile Phe Asn Gln Asp Phe
450 455 460

Leu Met Arg Val Ala Ser Glu Arg Gly Asn Lys Arg Gln Leu Arg Ser
465 470 475 480

Ser Ala Ser Val Asp Arg Ser His Asp Ser Ser Pro Pro Gly Ser Pro
485 490 495

Ser Thr Pro Ser Cys Pro Thr Thr Ile Leu Ile Leu Val Val His Ala
500 505 510

Gly Ser Val Leu Asp Ala Ala Ser Glu Leu Thr Ala Lys Lys Ser Asp
515 520 525

Val Thr Thr Phe Arg Gly Ser Phe Glu Ala Val Met Arg His Asp Tyr
530 535 540

Pro Ser Leu Leu Thr His Val Thr Ile Lys Met Val Pro Cys Pro Ser
545 550 555 560

Ile Cys Thr Asp Ala Leu Gly Ile Leu Ser Ser Leu Ser Pro Tyr Ser
565 570 575

Phe Asp Ala Ser Pro Ser Ala Ala Asp Ile Pro Asn Ile Ala Asp Val
580 585 590

Pro Ile Gly Ala Ile Pro Leu Leu Ser Val Ala Ser Pro Glu Phe His
595 600 605

Glu Thr Val Asn Lys Thr Val Ala Ala Ala Asn Ile Val Cys His Glu
610 615 620

Phe Leu Lys Ser Glu Glu Gly His Gly Phe Ser Gly Gln Ile Val Met
625 630 635 640

Leu Gly Asp Ser Met Gly Ser Leu Leu Ala Tyr Glu Ala Leu Cys Arg
645 650 655

Ser Asn Gly Ser Gln Pro Gly Thr Ala Ser Gly Ala Ser Asn Ser Gly
660 665 670

Gly Asp Ala Ala Thr Asn Ile Asn Thr His Asn Pro Leu Ser Pro Arg
675 680 685

Asn Ser Arg Leu Asp Asp Asp Glu Arg Phe Ile Glu Ala Asp Leu Asp
690 695 700

Ala Lys Arg Leu Leu Val Ala Pro Ser Pro Arg Arg Arg Arg Ser Ser
705 710 715 720

Ser Ser Ser Asp Ser Arg Ala Thr Lys Leu Asp Phe Glu Val Cys Asp
725 730 735

Phe Phe Met Phe Gly Ser Pro Leu Ser Val Val Leu Ala Ala Arg Lys
740 745 750

Leu His Asp Ala Lys Ala Ala Leu Pro Arg Pro Asn Cys His Gln Val
755 760 765

Tyr Asn Leu Phe His Pro Thr Asp Pro Ile Ala Ser Arg Leu Glu Pro
770 775 780

Leu Leu Ser Ala Arg Phe Ser Ile Leu Ala Pro Val Asn Val Pro Arg
785 790 795 800

Tyr Ala Lys Tyr Pro Leu Gly Asn Gly Gln Pro Leu His Leu Leu Glu
805 810 815

Val Ile Gln Ser His Pro Gln Arg Phe Asn Asp Gly Asn Asn Leu Leu
820 825 830

Ala Gly Arg Arg Leu Ser Asp Ala Ser Met Gln Ser Thr Ile Ser Gly
835 840 845

Leu Ile Glu Asn Val Ser Leu Ser Thr Ile His Ala Leu Gln Asn Lys
850 855 860

Trp Trp Gly Thr Lys Arg Leu Asp Tyr Ala Leu Tyr Cys Pro Glu Gly
865 870 875 880

Leu Ser Asn Phe Pro Ala His Ala Leu Pro His Leu Phe His Ala Ser
885 890 895

Tyr Trp Glu Ser Pro Asp Val Ile Ala Phe Ile Leu Arg Gln Ile Gly
900 905 910

Lys Phe Glu Gly Ile Pro Phe Val Gly Ser Asn Asp Asp Lys Asp Asn
915 920 925

Ala Ser Phe His Pro Gly Gln Pro Arg Glu Lys Trp Ile Lys Lys Arg
930 935 940

Thr Ser Val Lys Leu Lys Asn Val Ala Ala Asn His Arg Ala Asn Asp
945 950 955 960

Val Ile Val Gln Glu Gly Arg Glu Gln Arg Leu Asn Ala Arg Phe Met
965 970 975

Tyr Gly Pro Leu Asp Met Ile Thr Leu His Gly Glu Lys Val Asp Val
980 985 990

His Ile Met Lys Asp Pro Pro Ala Gly Gln Trp Thr Phe Leu Ser Thr
995 1000 1005

Glu Val Thr Asp Lys Asn Gly Arg Ile Ser Tyr Ser Ile Pro Asp Gln
1010 1015 1020

Val Ser Leu Gly Tyr Gly Ile Tyr Pro Val Lys Met Val Val Arg Gly
1025 1030 1035 1040

Asp His Thr Ser Val Asp Cys Tyr Met Ala Val Val Pro Arg Xaa Thr
1045 1050 1055

Glu Cys Val Val Phe Ser Ile Asp Gly Ser Phe Thr Ala Ser Met Ser
1060 1065 1070

Val Thr Gly Arg Asp Pro Lys Val Arg Ala Gly Ala Val Asp Val Cys
1075 1080 1085

Arg His Trp Gln Glu Leu Gly Tyr Leu Leu Ile Tyr Ile Thr Gly Arg
1090 1095 1100

Pro Asp Met Gln Gln Gln Arg Val Val Ser Trp Leu Ser Gln His Asn
1105 1110 1115 1120

Phe Pro His Gly Leu Ile Ser Phe Ala Asp Gly Leu Val His Asp Pro
1125 1130 1135

Leu Gly His Lys Thr Ala Tyr Leu Gln Gln Leu Val Xaa Glu Pro Trp
1140 1145 1150

Asn Leu Asn Tyr Cys Pro Tyr Gly Ser Ser Lys Asp Ile Ser Val Tyr
1155 1160 1165

Thr Asn Val Gly Met Arg Thr Asp Gln Ile Phe Ile Val Gly Lys Val
1170 1175 1180

Gly Lys Lys Leu Gln Ser Asn Ala Thr Val Leu Ser Asp Gly Tyr Ala
1185 1190 1195 1200

Ala His Leu Ala Gly Leu Gln Ala Val Gly Gly Ser Arg Pro Ala Lys
1205 1210 1215

Gly Asn Ala Arg Met Val Ile Pro Arg Gly Cys Phe Asn Leu Pro Gly
1220 1225 1230

Gln Thr Ala Asn Pro Arg Arg Arg Arg Leu His Glu Gln Ala Thr Asn
1235 1240 1245

Glu Asn
1250



(2) INFORMATION FOR SEQ ID NO: 10:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 amino acids

(B) TYPE: amino acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Tyr Pro Tyr Asp Val Pro Asp Tyr Ala Ser
1 5 10

(2) INFORMATION FOR SEQ ID NO: 11:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 amino acids

(B) TYPE: amino acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu
1 5 10

Claims



1. Isolated, purified, or enriched nucleic acid
encoding a rdgB polypeptide.

2. The nucleic acid of claim 1, wherein said rdgB
polypeptide is a mammalian rdgB polypeptide.

3. The nucleic acid of claim 2, wherein said
mammalian rdgB polypeptide is a human rdgB polypeptide.

4. A nucleic acid probe for the detection of
nucleic acid encoding a rdgB polypeptide in a sample.

5. Recombinant nucleic acid encoding a rdgB
polypeptide and a vector or a promoter effective to
initiate transcription in a host cell.

6. An isolated, purified, recombinant, or enriched
rdgB polypeptide.

7. The rdgB polypeptide of claim 5, wherein said
rdgB polypeptide is a mammalian rdgB polypeptide.

8. The rdgB polypeptide of claim 6, wherein said
rdgB polypeptide is a human rdgB polypeptide.

9. A purified antibody having specific binding
affinity to a rdgB polypeptide.

10. A hybridoma which produces an antibody having
specific binding affinity to a rdgB polypeptide.

11. A method of detecting a compound capable of
binding to a rdgB polypeptide comprising the steps of
incubating said compound with said rdgB polypeptide and
detecting the presence of said compound bound to said rdgB
polypeptide.

12. A method of screening potential agents useful
for treatment of a disease or condition characterized by
an abnormality in a signal transduction pathway, wherein
said signal transduction pathway includes an interaction
between a rdgB polypeptide and a natural binding partner,
comprising the step of assaying said potential agents for
those able to promote or disrupt said interaction as an
indication of a useful said agent.

13. A method for diagnosis of a disease or condition
characterized by an abnormality in a signal transduction
pathway, wherein said signal transduction pathway includes
an interaction between a rdgB polypeptide and a natural
binding partner, comprising the step of detecting the
level of said interaction as an indication of said disease
or condition.

14. A method for treatment of an organism having a
disease or condition characterized by an abnormality in a
signal transduction pathway, wherein said signal
transduction pathway includes an interaction between a
rdgB polypeptide and a natural binding partner comprising
the step of promoting or disrupting said interaction.

15. An isolated nucleic acid molecule comprising a
nucleotide sequence that:

(a) encodes a polypeptide having the full length
amino acid sequence set forth in SEQ ID NO: 4, SEQ ID
NO: 5, or SEQ ID NO: 6;

(b) the complement of the nucleotide sequence
of (a); or

(c) hybridizes under highly stringent
conditions to the nucleotide sequence of (a) and encodes
a naturally occurring rdgB protein.

16. A nucleic acid molecule comprising a nucleotide
sequence that encodes

(a) a rdgB protein having the full length amino
acid sequence set forth in SEQ ID NO: 4 except
that it lacks one of the following segments of amino acid
residues: 1-616, or 616-974;

(b) the complement of the nucleotide sequence
of (a);

(c) a rdgB protein having the full length amino
acid sequence set forth in SEQ ID NO: 5 except that it
lacks one of the following segments of amino acid
residues: 1-250, 250-900, or 900-1243;

(d) the complement of the nucleotide sequence
of (c);

(e) a rdgB protein having the full length amino
acid sequence set forth in SEQ ID NO: 6 except that it
lacks one of the following segments of amino acid
residues: 1-251, 251-985, or 985-1349; or

(f) the complement of the nucleotide sequence
of (e).

17. A nucleic acid molecule comprising a nucleotide
sequence that encodes

(a) a polypeptide having an amino acid sequence
set forth in SEQ ID NO: 4 from amino acid residues 1-616 or
616-974;

(b) the complement of the nucleotide sequence
of (a);

(c) a polypeptide having an amino acid sequence
set forth in SEQ ID NO: 5 from amino acid residues 1-250,
250-900, or 900-1243;

(d) the complement of the nucleotide sequence
of (c);

(e) a polypeptide having an amino acid sequence
of SEQ ID NO: 6 from amino acid residues 1-251, 251-985, or
985-1349; or

(f) the complement of the nucleotide sequence
of (e).

18. An isolated nucleic acid molecule comprising a
nucleotide sequence that encodes a polypeptide having the
full length amino acid sequence set forth in SEQ ID NO: 4,
SEQ ID NO: 5, or SEQ ID NO: 6 except that it lacks at least
one, but not more than two, of the domains selected from
the group consisting of the PIT domain, the central
domain, the PYK2 binding domain, the calcium binding
domain and the nucleotide binding domain.

19. A recombinant vector containing the nucleotide
sequence of any one of claims 14-18.

20. A genetically engineered host cell containing
the nucleotide sequence of any one of claims 14-18.